I have… feelings about LLMs being the big thing in AI/ML right now… because there's really not much new in them. Maybe the transformer architecture, kind of, but ultimately LLMs are massive neural nets trained with (self-)supervised learning on obscene amounts of data. Other models then take that pretrained “foundation model” and just fine-tune its parameters, or skip training entirely and only steer it through prompts. Which is why “prompt engineer” is becoming a thing.
Corpos are playing by the book here, trying to extinguish any competition before it begins by getting people to rely on their “foundation” models instead of innovating their own solutions.
How many tutorials can you find for implementing LLM NLP tasks that don't include “import this model from X company”? I'd wager it's only maybe 33%.
Part of what makes localized model engines and custom ML chips interesting is precisely their ability to enable small, custom, local models. Right now LLMs require so much computational power and such massive amounts of data to train and operate that even the most expensive options reportedly lose money with every prompt query.
So the reason every tutorial starts with “download this model” is that there's a good chance you don't have the hundreds, if not thousands, of supercomputer-cluster chips and the many terabytes (if not petabytes) of scraped and curated data needed to train a large natural language processing model. There's a reason there are only big players in this game.
Even if you could design your own model… how do you acquire a dataset even a fraction of the size of the ones the corps used for those pretrained models?
Then how do you train the model in a reasonable time, other than by relying on cloud computing? That leads to the same problem: only corps can play this game properly right now.
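To put a rough number on “reasonable time”, here's a back-of-envelope sketch. It leans on the commonly cited rule of thumb that training a dense transformer costs roughly 6 FLOPs per parameter per token, which is only an approximation. Every number below (model size, token count, GPU throughput, utilization) is a made-up assumption for illustration, not a figure for any real model or chip.

```python
# Back-of-envelope training cost. All inputs are hypothetical assumptions.

SECONDS_PER_DAY = 86_400


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs using the ~6 * N * D rule of thumb."""
    return 6.0 * n_params * n_tokens


def gpu_days(total_flops: float, peak_flops_per_sec: float, utilization: float) -> float:
    """Days a single accelerator would need at the given sustained utilization."""
    sustained = peak_flops_per_sec * utilization
    return total_flops / sustained / SECONDS_PER_DAY


# Assumed values, chosen only to show the order of magnitude.
N_PARAMS = 7e9           # hypothetical 7-billion-parameter model
N_TOKENS = 1e12          # hypothetical 1 trillion training tokens
PEAK_FLOPS = 1e14        # hypothetical accelerator: 100 TFLOP/s peak
UTILIZATION = 0.4        # assumed fraction of peak actually sustained

flops = training_flops(N_PARAMS, N_TOKENS)
days_one_gpu = gpu_days(flops, PEAK_FLOPS, UTILIZATION)
days_1000_gpus = days_one_gpu / 1000  # assumes perfect scaling, which is optimistic

print(f"total training FLOPs: {flops:.2e}")
print(f"one GPU:    {days_one_gpu:,.0f} days (~{days_one_gpu / 365:.0f} years)")
print(f"1000 GPUs:  {days_1000_gpus:,.1f} days")
```

Under those assumptions a single accelerator would need on the order of decades, and even a thousand of them would be busy for well over a week, before counting failed runs and hyperparameter sweeps.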
I designed a relatively small deep CNN for my master's thesis and collected/labeled the data for it, and training it on 60,000 images was taking over a dozen hours on a 1080 Ti (this was 5 years ago at this point, so that part may be misremembered).
Facts.