The model was trained on text collected from the Internet: about 570 GB of data drawn from books, web pages, Wikipedia, articles, and other writing published online. In total, roughly 300 billion words were fed into the system. ChatGPT was fine-tuned from GPT-3.5, a language model designed to generate text.
ChatGPT was optimized for dialogue using reinforcement learning from human feedback (RLHF), a method that uses human demonstrations and preference comparisons to guide the model toward the desired behavior.
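To make the preference-comparison idea more concrete, here is a minimal sketch of the pairwise loss commonly used to train a reward model from human rankings. It is an illustration under stated assumptions, not OpenAI's actual implementation: the function name `preference_loss` and the example scores are hypothetical, and the formula is the standard pairwise logistic (Bradley-Terry style) loss. The loss is small when the response a human preferred gets the higher score, and large when it does not.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Pairwise logistic loss: -log(sigmoid(reward_preferred - reward_rejected)).

    Hypothetical helper for illustration only. The two arguments stand for the
    scalar scores a reward model assigns to the response a human labeler
    preferred and to the response the labeler rejected.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); the branch below
    # avoids overflow in exp() when the margin is a large negative number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Made-up scores: the reward model agrees with the human ranking...
loss_when_agreeing = preference_loss(2.0, 0.0)
# ...and disagrees with it.
loss_when_disagreeing = preference_loss(0.0, 2.0)
print(loss_when_agreeing < loss_when_disagreeing)
```

In a full RLHF pipeline, a reward model trained with a loss of this kind would then supply the reward signal for the reinforcement learning step; that step is not shown here.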