- cross-posted to:
- technology@lemmit.online
If you read the article you find this was a dataset from a nonprofit, available to anyone. The nonprofit used captions from a set of YouTube videos.
“Most of the Pile’s datasets are accessible and open for anyone on the internet with enough space and computing power to access them.”
That anyone included a lot of other big names in tech, not just Apple.
Also I wasn’t aware that Apple had its own AI. I thought they were licensing stuff from others like OpenAI. I guess maybe this is some research project for an unannounced project?
IIRC some of Apple’s models meant to be run on-device were open source. That was probably done to get more users wanting to build on it.
And…?
Not defending Apple here, but everyone with a vested interest in AI is doing it. Nobody is asking permission or respecting copyright in this race to the bottom.
If you post it publicly, expect it to be used publicly.
I know a lot of people are down voting your comment, but I want you to know they are down voting the idea that companies treat public content like public property.
You shouldn’t be down voted for pointing that out.
It’s a problem with how we categorise content as either private or public without regard to copyright.
It seems copyright is for big companies like Disney, but a YouTube creator isn’t afforded the same protection for their creation. They are merely providing “content,” not intellectual property.
Anyway, I get what you were saying.
No consent? I bet Google said yes.
Right? I think people may be surprised by what the contracts they agreed to actually say, and by whose consent is needed on these platforms. Sad but true.
Shocking!
they don’t respect us? gasp!
Yeah?
AI models have been trained on every comment and JPG on the internet… and commercial movies on DVD… and every book in the library.
Shoveling all of the content through a sluice of linear algebra is pretty dang transformative. The more they use, the less any piece matters.
Oh look, Apple-biased MKBHD
Children in libraries are learning to read using books without consent from the authors
Children don’t make millions by selling copies of all the books they skimmed.
Most children don’t (sick burn against the Grimm Brothers). I mean, fuck Apple and all of these companies, but they’re hoovering data from a publicly available resource using totally legal means.
I know I’m snowballing here, but overreacting to this headline could end up supporting those who argue that web crawlers, plane-tracking bots, and the completely legal actions of Aaron Swartz, which the Feds tried to use to crucify him, should be illegal.
Once again, fuck Apple, but the real villain in this scenario is either Google for allowing companies to train their AI models on their content, or the content creators who are still using YouTube.
Since I can’t fault anyone who is trying to make a living by exploiting Google, I guess I’ll just add “fuck Google” to the pile.
As long as the AI also watched the ads, it’s fair game.
Just use an ad blocker already
Guess that means the bots aren’t watching the ads?
Great. Not looking forward to when the AIs try to convince me there are loads of lonely hot singles in my area.