I mean, isn’t it basically an automated way of doing the thing that was always legal within current copyright laws? You read a few books/articles on a subject, and you write your own content based on the information you ingested as part of that research? You’ve never needed to pay anyone to do this in the past.
The points about attribution depend mostly on the context: in academia or when writing a book, you attribute; in other contexts (like talks), you generally don’t. And either way, those aren’t legal requirements except for certain Creative Commons licences.
I’m not necessarily against updating laws to deal with this issue, but we have to be careful about not undermining fair use provisions, which are already under attack by automated systems that can’t tell whether something’s being used under a fair use provision or not.
Maybe this is the same as an LLM if you have no lived, internal experience.
The problem is that increasing the overreach of copyright further like that is bad for everyone and does nothing to curtail the actual problems of AI generation.
The problem with AI content generation is that it’s an infinite vapid slop factory that spouts gibberish forever, without any sort of sanity checking; whether or not it correctly paid the corporate owners of the various platforms the generators’ creators scraped is irrelevant. All that achieving “you must acquire an explicit license to authorize AI training on a given body of data” would mean is that a bunch of oligarch ghouls get a payday, and AI gets trained only on corporate-owned data, by big AI corporations that can work out deals with the owners (or you get a bunch of in-house shit, like how they all want their own streaming services).
The only solution is to make AI-generated content a poison pill for copyright in general, regardless of whether that “makes sense” or “has precedent”: any work that contains generative AI content becomes uncopyrightable and falls into the public domain, voiding any licenses attached to it. Protect silly hobbyist work while harshly punishing corporate use.
To put it bluntly: if something like DisneyExtraGeneratorAI makes a crowd scene, it doesn’t matter whether it was built by scraping stock photos or by paying Disney adults in ride fastpass priority and a coupon for a slightly discounted $1000 cocktail at their resort for some full-body scans. The problem isn’t the licensing; it’s the fact that they’re replacing actors and getting an engine they can further profit from by licensing out. It’s a new sort of enclosure whether they’re paying to do it or not, and the only way to stop it is to make it impossible to profit from.
which is why this will never ever happen
Thanks for engaging with the content of the post instead of doing a lazy ad hominem that doesn’t even attempt to refute anything I said.
Pointing out that what you said demonstrates a complete ignorance of how LLMs are used does not constitute an ad hominem. And even if it had, who gives a shit, this isn’t high school debate club. If they want to call you a dork, no hall monitor is going to pull them up on it.
I mean, isn’t it basically an automated way of doing the thing that was always legal within current copyright laws?
Nope. You are smarter and more creative than an LLM. LLMs don’t understand their material; they just copy patterns and do substitutions.
It’s more like doing a good job of plagiarism: take three sources about the idea and copy them, then switch up the words. Also add some random sentences of dubious quality that sound right but, who knows, are probably lies. They sure do look like things humans wrote in the neighborhood of this topic, though.
That’s my point: if you do exactly that as a human, you’re fine from a copyright standpoint. The LLM created a new work, it’s quoting existing work under fair use, and so on.
It’s completely fraudulent in an academic setting, and it’s a way to get around anyone getting paid for their original labour, but that’s not what we’re talking about.