Wikipedia has a new initiative called WikiProject AI Cleanup. It is a task force of volunteers currently combing through Wikipedia articles, editing or removing false information that appears to have been posted by people using generative AI.

Ilyas Lebleu, a founding member of the cleanup crew, told 404 Media that the crisis began when Wikipedia editors and users began seeing passages that were unmistakably written by a chatbot of some kind.

  • kibiz0r@midwest.social
    English
    arrow-up
    4
    ·
    10 minutes ago

    Unleashing generative AI on the world was basically the information equivalent of jumping headfirst into Kessler Syndrome.

  • narc0tic_bird@lemm.ee
    English
    arrow-up
    27
    ·
    4 hours ago

    Best case is that the model used to generate this content was originally trained on data from Wikipedia, so it “just” generates a worse, hallucinated “variant” of the original information. Goes to show how stupid this idea is.

    Imagine this in a loop: AI trained on Wikipedia that then alters content on Wikipedia, which in turn gets picked up by the next model trained. It would just get worse and worse, similar to how re-encoding the same video over and over again yields continuously worse results.

    • huginn@feddit.it
      English
      arrow-up
      9
      ·
      1 hour ago

      See also: model collapse

      (Which is more or less just regression towards the mean with more steps)

    • Wrench@lemmy.world
      English
      arrow-up
      6
      ·
      2 hours ago

      Yes, this is what many of us worry the internet in general will become: AI content generated by AI trained on AI garbage.

      AI bots can trivially outpace humans.

      • kboy101222@sh.itjust.works
        English
        arrow-up
        4
        arrow-down
        1
        ·
        1 hour ago

        I was just discussing with a friend of mine how we’re rapidly approaching the dead internet. At some point, many websites will likely just be chat bots talking to other chat bots, whose output then gets used to train further chat bots. Human-made content is already becoming harder and harder to find on algorithm-heavy websites like Reddit and Facebook’s suite of sites. The bots can easily outpace any algorithmic changes the sites might make to deter them: my FB-using family members all constantly block those weird Jesus accounts, and they still show up constantly.

  • schizo@forum.uncomfortable.business
    English
    arrow-up
    156
    arrow-down
    6
    ·
    6 hours ago

    Further proof that humanity neither deserves nor is capable of having nice things.

    Who would set up an AI bot to shit all over the one remaining useful thing on the Internet, and why?

    I’m sure the answer is either ‘for the lulz’ or ‘late-stage capitalism’, but still: historically humans aren’t usually burning down libraries on purpose.

    • Wrench@lemmy.world
      English
      arrow-up
      3
      ·
      2 hours ago

      Because basement losers can’t conquer and raze libraries to the ground.

      The internet has shown that assumed anonymity results in people fucking with other people’s lives for the hell of it. Viruses, trolling, etc. This is just the next stage of it because of a new easy-to-use tool.

    • Schmoo@slrpnk.net
      English
      arrow-up
      61
      arrow-down
      3
      ·
      4 hours ago

      historically humans aren’t usually burning down libraries on purpose.

      How on earth have you come to this conclusion?

    • poszod@lemmy.world
      English
      arrow-up
      73
      ·
      5 hours ago

      State actors could be interested in doing that. Same with the Internet Archive attacks.

    • endofline@lemmy.ca
      English
      arrow-up
      6
      arrow-down
      1
      ·
      4 hours ago

      It’s not about doing it on purpose; most people just don’t care about what’s not in their interest. Today, interests are usually quite shallow, which TikTok shows quite well. Libraries require money to operate. Even the Internet Archive and Wikipedia do.

    • Petter1@lemm.ee
      English
      arrow-up
      2
      arrow-down
      2
      ·
      edit-2
      2 hours ago

      Maybe a strange form of activism that is trying to poison new AI models 🤔

      Which would not work, since all the tech giants have already archived the pre-AI internet.

      • schizo@forum.uncomfortable.business
        English
        arrow-up
        3
        ·
        2 hours ago

        Ah, so the AI version of the Chewbacca defense.

        I have to wonder if intentionally shitting on LLMs with plausible nonsense is effective.

        Like, you watch for certain user agents and change what data you actually send the bot vs what a real human might see.
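
A minimal sketch of the user-agent idea above, not a tested defense. The marker list and the function names `is_known_crawler` and `select_body` are hypothetical, chosen only for illustration. Real crawler user-agent strings vary, change over time, and can be spoofed, so this only catches bots that identify themselves honestly.

```python
# Substrings assumed to identify self-declared AI crawlers.
# This list is an illustrative assumption, not an authoritative registry.
CRAWLER_MARKERS = ("gptbot", "ccbot")


def is_known_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)


def select_body(user_agent: str, real_body: str, decoy_body: str) -> str:
    """Pick the decoy text for self-identified crawlers, the real text otherwise."""
    return decoy_body if is_known_crawler(user_agent) else real_body
```

A web handler would call `select_body` with the request's `User-Agent` header value. Whether feeding crawlers decoy text actually degrades a model is the open question raised in the comment; the sketch only shows the routing step.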

  • sbv@sh.itjust.works
    English
    arrow-up
    76
    ·
    5 hours ago

    As for why this is happening, the cleanup crew thinks there are three primary reasons.

    “[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,”

    That last one. Ouch.

    • TimLovesTech (AuDHD)(he/him)@badatbeing.social
      English
      arrow-up
      18
      ·
      4 hours ago

      “[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,”

      I think the main reason people are misinformed about AI content is that, outside of tech people, most have no idea that AI will:

      1. 100% make up answers to things it doesn’t know, because the data it ingested was either too small a sample or bad. And it will do this with the same robotic confidence you get for any other answer.

      2. AI that has been fed too much other AI-generated content will begin to “hallucinate” and give some wild outputs, very similar to humans suffering from schizophrenia. And again, these answers will be given as “fact” with the same robotic confidence.

    • BigDanishGuy@sh.itjust.works
      English
      arrow-up
      5
      arrow-down
      1
      ·
      4 hours ago

      Well, I was in doubt, so I asked the AI whether I could trust the answers and it told me not to worry about it. That must mean that I only get accurate answers, right? /s

  • lolola@lemmy.blahaj.zone
    English
    arrow-up
    27
    arrow-down
    2
    ·
    5 hours ago

    I hate to post because I have loved and trusted Wikipedia for years, but the fact that there are folks out there who equally trust what AI tools generate just baffles me.

    • Bahnd Rollard@lemmy.world
      English
      arrow-up
      24
      ·
      5 hours ago

      They used to be contained; every village had its idiot. Now that the internet is the global village, all the formerly isolated idiots have a place to chat.

      • sunzu2@thebrainbin.org
        arrow-up
        5
        arrow-down
        2
        ·
        4 hours ago

        Amazing how these idiots are this effective…

        While us common folk can’t organize or agree on anything

        • Geobloke@lemm.ee
          English
          arrow-up
          3
          ·
          3 hours ago

          Most of us do something idiotic once and, when the opportunity to do it again arises, pull back and think, “this was embarrassing last time, maybe I’ll re-evaluate.”

          But a dedicated idiot is a different beast: full of confidence, and having had whatever organ produces shame surgically removed, enabling them to commit ever greater acts of idiocy. But then the internet was invented and these people met. Some even had babies. And now there is an arms race to see how many idiots can squeeze through the same tiny door. They have recognised their time to shine and seized it with their clammy yet also sticky hands.

          Truly, it’s inspiring in its own special way.

  • Aatube@kbin.melroy.org
    arrow-up
    14
    ·
    5 hours ago

    Don’t worry, it’s not as bad as the title suggests. The attack on Internet Archive is far, far worse. It’s obviously a bit of a problem, though.

  • RubberDuck@lemmy.world
    English
    arrow-up
    9
    arrow-down
    3
    ·
    5 hours ago

    Require anyone who wants to add content to pay a small amount to the Wikimedia Foundation to activate their account, and refund it once they have done a certain amount of moderation.

    • aubertlone@lemmy.world
      English
      arrow-up
      6
      ·
      5 hours ago

      Yeah, I mean, I’ve had minor edits reverted because I didn’t source the fact properly.

      And that was like 10 years ago. I’m surprised these edits are getting through in the first place.

      • Shdwdrgn@mander.xyz
        English
        arrow-up
        5
        arrow-down
        1
        ·
        5 hours ago

        Seems like that would be an easy problem to solve… require all edits to have a peer review by someone with a minimum credibility before they go live. I can understand when Wikipedia was new, allowing anyone to post edits or new content helped them get going. But now? Why do they still allow any random person to post edits without a minimal amount of verification? Sure it self-corrects given enough time, but meanwhile what happens to all the people looking for factual information and finding trash?

        • sugar_in_your_tea@sh.itjust.works
          English
          arrow-up
          2
          ·
          3 hours ago

          Or at least give it a certain amount of time before it goes live. So if nobody comes around to approve it in 24 hours, it goes live.

          Usually bad edits are corrected within hours, if not minutes, so that should catch the lion’s share w/o bogging down the approval queue too much.
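
A rough sketch of the delayed-publication rule described above, under stated assumptions: the names `PendingEdit`, `is_live`, and `REVIEW_WINDOW` are hypothetical, and this is not how Wikipedia's software actually handles edits. An edit goes live as soon as a reviewer approves it, never if a reviewer rejects it, and automatically once 24 hours pass with no decision.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# The 24-hour window comes from the comment above; tune as needed.
REVIEW_WINDOW = timedelta(hours=24)


@dataclass
class PendingEdit:
    text: str
    submitted_at: datetime
    approved: bool = False  # a reviewer explicitly accepted the edit
    rejected: bool = False  # a reviewer explicitly reverted the edit


def is_live(edit: PendingEdit, now: datetime) -> bool:
    """Decide whether an edit should be visible to readers at time `now`."""
    if edit.rejected:
        return False  # rejection always wins, even after the window
    if edit.approved:
        return True   # approval publishes immediately
    # No decision yet: publish only once the review window has elapsed.
    return now - edit.submitted_at >= REVIEW_WINDOW
```

The design choice here is that silence counts as approval, which keeps the queue from blocking good edits but still lets unreviewed bad edits through after a day.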

        • RubberDuck@lemmy.world
          English
          arrow-up
          1
          arrow-down
          1
          ·
          3 hours ago

          Crowdsourcing is the strength that led to this vast resource, and also the weakness on display here. So there will probably be a need for some form of barrier. Hence my suggestion.