cross-posted from: https://nom.mom/post/121481
OpenAI could be fined up to $150,000 for each piece of infringing content.
https://arstechnica.com/tech-policy/2023/08/report-potential-nyt-lawsuit-could-force-openai-to-wipe-chatgpt-and-start-over/#comments
Not trying to argue or troll, but I really don’t get this take, maybe I’m just naive though.
Like yea, fuck Big Data, but…
Humans do this naturally, we consume data, we copy data, sometimes for profit. When a program does it, people freak out?
Edit: well, fuck me for taking 10 minutes to write my comment. Seems this was already said and covered while I was typing mine lol
It’s just a natural extension of the concept that entities have some kind of ownership of their creation and thus some say over how it’s used. We already do this for humans and human-based organizations, so why would a program not need to follow the same rules?
Because we don’t already do this. In fact, the raw knowledge contained in a copyrighted work is explicitly not protected by copyright, and people can do with it as they please. Only the specific expression of that knowledge can be copyrighted.
An AI model doesn’t contain the copyrighted works that went into training it. It only contains the concepts that were learned from them.
There’s no learning of concepts. That’s why models hallucinate so frequently. They don’t “know” anything; they’re doing a lot of math based on what they’ve seen before and essentially taking the best guess at what the next word is.
There very much is learning of concepts. This is completely provable. You can give it problems it has never seen before and it will come up with good solutions.
So if I make an AI version of the Google name and logo, it’s cool? I’m pretty sure it’s not.