I’m rather curious to see how the EU’s privacy laws are going to handle this.

(Original article is from Fortune, but Yahoo Finance doesn’t have a paywall)

  • orclev@lemmy.world · 1 year ago

    They know how it works. It’s a statistical model. Given a sequence of words, there’s a set of probabilities for what the next word will be. That’s the problem: an LLM doesn’t “know” anything. It’s not a collection of facts. It’s like a pachinko machine where each peg in the machine is a word. The prompt you give it determines where and how the ball gets dropped in, and all the pins it hits on the way down correspond to the output. How those pins get labeled is the learning process. Once that’s done, there really isn’t any going back. You can’t unscramble that egg to pick out one piece of the training data.
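
    To make that concrete, here’s a toy sketch in plain Python (a bigram counter, not a neural network - the corpus and names are made up for illustration) of the core “given a sequence, sample the next word from learned probabilities” idea:

    ```python
    # Toy illustration of next-word sampling. A bigram counter, NOT a
    # neural network - it only shows the "sample from probabilities" step.
    import random
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat the cat ate the fish".split()

    # "Training": count which word follows which.
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def next_word(prev):
        words, weights = zip(*follows[prev].items())
        # Sample proportionally to how often each word followed `prev`.
        return random.choices(words, weights=weights)[0]

    print(next_word("the"))  # "cat" 2/4 of the time, "mat" or "fish" 1/4 each
    ```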

    • garyyo@lemmy.world · 1 year ago

      While you are overall correct, there is still a sort of “black box” effect going on. We understand the mechanics of how the network architecture works, but the actual information encoded by training is, as you said, not stored in a way that is easily accessible or editable by a human.

      I am not sure if this is what OP meant by it, but it kinda fits and I wanted to add a bit of clarification. Relatedly, the easiest way to uncook (or unscramble) an egg is to feed it to a chicken, which amounts to basically retraining a model.

    • DharkStare@lemmy.world · 1 year ago

      I really like that pachinko analogy. It gets the basic concept across without having to wade into technical descriptions.

    • LittleLordLimerick@lemm.ee · 1 year ago

      “It’s a statistical model. Given a sequence of words, there’s a set of probabilities for what the next word will be.”

      That is a gross oversimplification. LLMs operate on much more than just statistical probabilities. It’s true that they predict the next word based on probabilities learned from training datasets, but they also have layers of transformers that process the context provided by a prompt to tease out meaningful relationships between words and phrases.

      For example: Imagine you give an LLM the prompt, “Dumbledore went to the store to get ice cream and passed his friend Sam along the way. At the store, he got chocolate ice cream.” Now, if you ask the model, “who got chocolate ice cream from the store?” it doesn’t just blindly rely on statistical likelihood. There’s no way you could argue that “Dumbledore” is a statistically likely word to follow the text “who got chocolate ice cream from the store?” Instead, it uses its understanding of the specific context to determine that “Dumbledore” is the one who got chocolate ice cream from the store.

      So, it’s not just statistical probabilities; the models have the ability to comprehend context and generate meaningful responses based on that context.
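
      For the curious, here’s a minimal numpy sketch of the attention step being described. The vectors are invented for illustration (a real model learns them), but the math is standard scaled dot-product attention - and note it’s still a probability distribution (a softmax), just one conditioned heavily on context:

      ```python
      # Minimal scaled dot-product attention. The query/key/value numbers
      # below are made up for illustration; a trained model learns them.
      import numpy as np

      def attention(Q, K, V):
          scores = Q @ K.T / np.sqrt(K.shape[-1])   # query-key similarity
          weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
          weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax
          return weights, weights @ V               # weights, mixed output

      Q = np.array([[0.9, 0.1]])   # hypothetical query: "who got the ice cream?"
      K = np.array([[0.9, 0.1],    # key for the "Dumbledore" position
                    [0.1, 0.9],    # key for the "Sam" position
                    [0.5, 0.5]])   # key for the "store" position
      V = np.eye(3)
      weights, _ = attention(Q, K, V)
      print(weights)  # most weight on position 0 - it "attends to" Dumbledore
      ```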

    • theneverfox@pawb.social · 1 year ago

      This is mostly true, except they do store information - it’s just not in a consistent, machine-readable form.

      You can analyze it with specialized tools, and an expert can gain some ability to understand what is stored in a specific connection and manually modify it (in a very blunt way).

      Scrambling an egg is a good analogy up to a point - you can’t extract the training data back out. It’s essentially extremely high-ratio, lossy compression from an informational perspective.

      You can’t get the egg back, but you can modify the model to change the information inside it. It’s extremely complex, but it’s a very active field of study: with simpler models we’ve been able to separate data out from ability, and the idea is to use something closer to a database that can be modified without doing brain surgery every time.
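
      As a toy illustration of that “edit the stored fact without retraining” idea (in the spirit of model-editing research like ROME, but emphatically not that actual algorithm - just a linear key-to-value memory with a rank-one overwrite):

      ```python
      # Toy linear associative memory: v = W @ k. A rank-one update
      # rewrites what one key maps to; keys orthogonal to it are untouched.
      # Purely illustrative - editing a real LLM is far messier.
      import numpy as np

      rng = np.random.default_rng(0)

      k = rng.normal(size=4)
      k /= np.linalg.norm(k)            # unit-norm key for one stored "fact"
      v_old = rng.normal(size=4)        # the original association
      W = np.outer(v_old, k)            # "training": now W @ k == v_old

      v_new = rng.normal(size=4)        # corrected association, same key
      W += np.outer(v_new - W @ k, k)   # rank-one surgery on the stored fact

      print(np.allclose(W @ k, v_new))  # True: the "fact" was rewritten
      ```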

      You can’t guarantee destruction of information without complete understanding of the model, but we might be able to scramble personal details… Granted, it’s not something we can do right now.