I was using Bing to create a list of countries to visit. Since I have been to the majority of the African nation on that list, I asked it to remove the african countries…
It simply replied that it can’t do that due to how unethical it is to descriminate against people and yada yada yada. I explained my resoning, it apologized, and came back with the same exact list.
I asked it to check the list as it didn’t remove the african countries, and the bot simply decided to end the conversation. No matter how many times I tried it would always experience a hiccup because of some ethical process in the bg messing up its answers.
It’s really frustrating, I dunno if you guys feel the same. I really feel the bots became waaaay too tip-toey
I mean, the first part of this is just wrong (the next prompt usually includes everything that has been said so far}, and the second part is also not completely true. When generating, yes, they’re only ever predicting the next token, and start again after that. But internally, they might still generate a full conceptual representation of what the full next sentence or more is going to be, even if the generated output is just the first token of that. You might say that doesn’t matter because for the next token, that prediction runs again from scratch and might change, but remember that you’re feeding it all the same input as before again, plus one more token which nudges it even further towards the previous prediction, so it’s very likely it’s gonna arrive at the same conclusion again.
Do you mean that the model itself has no memory, but the chat feature adds memory by feeding the whole conversation back in with each user submission?
Yeah, that’s how these models work. They have also have a context limit, and if the conversation goes too long they start “forgetting” things and making more mistakes (because not all of the conversation can be fed back in).
Is that context limit a hard limit or is it a sort of gradual decline of “weight” from the characters further back until they’re no longer affecting output at the head?
Nobody really knows because it’s an OpenAI trade secret (they’re not very “open”). Normally, it’s a hard limit for LLMs, but many believe OpenAI are using some tricks to increase the effective context limit. I.e. some people believe instead of feeding back the whole conversation, they have GPT create create a shorter summaries of parts of the conversation, then feed the summaries back in.
I think it’s probably something that could be answered with general knowledge of LLM architecture.
Yeah OpenAI’s name is now a dirty joke. They decided before their founding that the best way to make AI play nice was to have many many many AIs in the world, so that the AIs would have to be respectful to one another, and overall adopt pro social habits because those would be the only ones tolerated by the other AIs.
And the way to ensure a community of AIs, a multipolar power structure, was to disseminate AI tech far and wide as quickly as possible, instead of letting it develop in one set of hands.
Then they said fuck that we want that power, and broke their promise.