Thanks to its neural network architecture, ChatGPT is designed to process and generate plausible responses to any meaningful sequence of characters, including different spoken languages, programming languages, and mathematical equations. ChatGPT is a version of GPT-3, a large language model also developed by OpenAI. Language models are a type of neural network trained on vast amounts of text; neural networks, in turn, are software loosely inspired by the way neurons in animal brains send signals to one another.
Recurrent neural networks, invented in the 1980s, can handle sequences of words, but they are slow to train and can forget earlier words in a sequence. In 1997, computer scientists Sepp Hochreiter and Jürgen Schmidhuber addressed this problem by inventing long short-term memory (LSTM) networks: recurrent neural networks with special components that allow information from earlier in an input sequence to be retained for longer. LSTMs could handle strings of text several hundred words long, but their language abilities remained limited.
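To make the "special components" concrete: an LSTM cell uses learned gates to decide how much past information to keep and how much new information to write. The sketch below is a minimal, scalar-valued illustration with hand-picked toy weights (not trained, and not ChatGPT's architecture); a forget gate biased toward staying open lets the cell state persist across later inputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell. w maps gate names to (wx, wh, b)."""
    def gate(name, act):
        wx, wh, b = w[name]
        return act(wx * x + wh * h_prev + b)

    f = gate("forget", sigmoid)    # how much of the old cell state to keep
    i = gate("input", sigmoid)     # how much new information to write
    g = gate("cand", math.tanh)    # candidate value to write
    o = gate("output", sigmoid)    # how much of the cell state to expose
    c = f * c_prev + i * g         # cell state: the long-lived memory lane
    h = o * math.tanh(c)           # hidden state: the per-step output
    return h, c

# Toy weights: the forget gate's bias of 2.0 keeps it mostly open
# (sigmoid(2.0) ~ 0.88), so old state decays slowly instead of vanishing.
weights = {
    "forget": (0.0, 0.0, 2.0),
    "input":  (0.5, 0.5, 0.0),
    "cand":   (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 2.0),
}

h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0, 0.0]:   # one "signal", then silence
    h, c = lstm_step(x, h, c, weights)
```

After three zero inputs, the cell state `c` still carries most of the value written at the first step, which is the retention behavior a plain recurrent network tends to lose.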