OpenAI says it’s “impossible” to create useful AI models without copyrighted material

sculd@beehaw.org · 10 months ago

OpenAI says it’s “impossible” to create useful AI models without copyrighted material

Pete Hahnloser@beehaw.org · 10 months ago

Any reasonable person can reach the conclusion that something is wrong here.

What I’m not seeing a lot of acknowledgement of is who really gets hurt by copyright infringement under the current U.S. scheme. (The quote is obviously directed toward the UK, but I’m reasonably certain a similar situation exists there.)

Hint: It’s rarely the creators, who usually get paid once while their work continues to make money for others.

Let’s say the New York Times wins its lawsuit. Do you really think the reporters who wrote the infringed-upon material will be getting royalty checks to be made whole?

This is not OpenAI vs creatives. OK, on a basic level it is, but expecting no one to scrape blogs and forum posts rather goes against the idea of the open internet in the first place. We’ve all learned by now that what goes on the internet stays there, with attribution totally optional unless you have a legal department. What’s novel here is the scale of scraping, but I see some merit to the “transformational” fair-use defense given that the ingested content is not being reposted verbatim.

This is corporations vs corporations. Framing it as millions of people missing out on what they’d have otherwise rightfully gotten is disingenuous.

MudMan@kbin.social · 10 months ago

Yep. The effect of this as currently framed is that you get data ownership clauses in EULAs forever and only major data brokers like Google or Meta can afford to use this tech at all. It’s not even a new scenario, it already happened when those exact companies were pushing facial recognition and other big data tools.

I agree that the basics of modern copyright don’t work great with ML in the mix (or with the Internet in the mix, while we’re at it), but people are leaning on the viral negativity to slip by very unwanted consequences before anybody can make a case for good use of the tech.

Pup Biru@aussie.zone · 10 months ago

it’s so baffling to me that some people think this is a clear cut problem of “you stole the work just the same as if you sold a copy without paying me!”

it ain’t the same folks… that’s not how models work… the outcome is unfortunate, for sure, but to just straight out argue that it’s the same is ludicrous… it’s a new problem and ML isn’t going away, so we’re going to have to deal with it as a new problem