- cross-posted to:
- technology@lemmy.world
Is there a slide deck or transcript of this? I don’t watch videos this way.
I’m still reading the machine-generated transcript of the video, but to keep it short:
The author was messing with ISBNs (International Standard Book Numbers) and noticed that invalid ones fell into three categories:
- Typos and similar.
- Publishers assigning an invalid ISBN to the book, because they didn’t get how ISBNs work.
- References "hallucinated"¹ by ChatGPT that do not match any actual ISBN.
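For context (my addition, not from the video): an ISBN carries a check digit, so fabricated references are easy to flag mechanically. Here's a minimal sketch in Python; the function names are mine:

```python
def is_valid_isbn13(isbn: str) -> bool:
    digits = [c for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    # Weights alternate 1, 3; the weighted sum must be divisible by 10.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0

def is_valid_isbn10(isbn: str) -> bool:
    chars = [c for c in isbn.upper() if c.isdigit() or c == "X"]
    # 'X' stands for 10 and may only appear as the final (check) digit.
    if len(chars) != 10 or "X" in chars[:-1]:
        return False
    # Weighted sum with weights 10 down to 1 must be divisible by 11.
    values = [10 if c == "X" else int(c) for c in chars]
    total = sum(v * w for v, w in zip(values, range(10, 0, -1)))
    return total % 11 == 0

print(is_valid_isbn13("978-0-306-40615-7"))  # True: a well-known valid example
print(is_valid_isbn13("978-0-306-40615-8"))  # False: a one-digit "typo"
```

This catches both typos and made-up numbers, but of course not a hallucinated reference that reuses some other book's valid ISBN.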
He then uses this to highlight that Wikipedia is already infested with bullshit from large “language” models², and that this creates a bunch of vicious cycles going against Wikipedia’s spirit of reliability, factuality, etc.
Then, if I got this right, he lays out four hypotheses (“theories”) on why people do this³:
- People who ignore the limitations of those models
- People seeking external help to contribute to Wikipedia
- People using chatbots to circumvent frustrating parts of doing something
- People with an agenda.
Notes (all of these are mine/Lvxferre’s; none of them is said by the author himself):
- “Hallucination”: a misleading label for output that was generated the exact same way as the rest of the output, but that leads to bullshit when interpreted by humans.
- I have a rant about calling those models “language” models, but to keep it short: I think “large token models” would be more accurate (see the sketch after these notes).
- In my opinion, the author is going the wrong way here. Disregard intentions and focus on effects: don’t assume good faith, don’t assume any faith at all. Instead, focus on the user’s behaviour. If they violate Wikipedia policies once, warn them; if they keep doing it, remove them as dead weight fighting against the spirit of the project.
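To illustrate note 2 (my addition; assumes the third-party tiktoken package, `pip install tiktoken`): the model never sees an ISBN as a number with a check digit, only as arbitrary sub-word tokens, which is why it can only ever produce plausible-looking digit sequences.

```python
import tiktoken

# "cl100k_base" is the encoding used by several OpenAI models.
enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("ISBN 978-0-306-40615-7")
print([enc.decode_single_token_bytes(t) for t in tokens])
# Prints something like [b'ISBN', b' ', b'978', b'-0', b'-', b'306', ...]:
# the check-digit arithmetic is invisible at this level.
```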