Once you’ve trained your large language model on the entire written output of humanity, where do you go? Here’s Ilya Sutskever, ex-OpenAI, admitting to Reuters that they’ve plateaued: [Reuters] The…
LLMs are quite impressive as chatbots, all things considered. The conversations with them are way more realistic than, and almost as funny as, the ones with the IRC Markov chain bot my friend made as a freshman CS student.
Of course, our bot's training data only included a few years of the IRC channel's logs and the Finnish Bible we later threw in for shits and giggles. A training set of approximately zero terabytes in total.
LLMs are less a marvel of machine-learning algorithms (though I admit those play a part) and more one of data scraping. By their own claims, they have already dug through the vast majority of the publicly accessible web, so where do you go from there? Sure, there are plenty of books that aren't on the web, but feeding them into the machine is about as hard as getting them onto the web in the first place.
Senior year of college, I took an elective seminar on interactive fiction. For the final project, one of my classmates wrote a program that scraped a LiveJournal and converted it into a text adventure game.
You are standing in an open field west of a giant castle, with an ornate front door. Your name is Ebony Dark'ness Dementia Raven Way. >
My own final project was a parody of the IMDb, as in "what if the IMDb were about books instead of movies", except that the user reviews told stories about people who turned out to have all gone to high school together before scattering around the world. Reading them in the right sequence unlocked a finale in which they reunited for a New Year's party and their world dissolved, so that their author could repurpose them for other stories.
fuck yes, why wasn’t my college this cool? all I got was an AI elective taught by a guy whose proudest achievement was having the only remaining Genera license on campus