Since LLMs are only as good as the data on which they’re based, it should be no surprise that they can function properly and still be biased and wrong.
A story by Kevin Roose in the New York Times illustrates this conundrum: When he asked various generative AIs about himself, he got results that accused him of being dishonest, and said that his writing often elevated sensationalism over analysis.
Granted, some of his work might truly stink, but did it warrant such vitriolic labels? He suspected that the problem was deeper, and that it went back to an article he wrote a year ago, along with others’ reactions to it.
That story recounted his interactions with a new Microsoft chatbot named “Sydney,” during which he was shocked by the tech’s ability, both demonstrated and suggested, to influence users.
What he found particularly creepy was when Syndey declared that it loved him and tried to convince him to leave his wife. It also fantisized about doing bad things, and stated “I want to be alive.”
The two-hour chat was so strange that Roose reported having trouble sleeping afterward.
Lots of other media outlets picked up his story and his concerns (like this one), while Microsoft issued typically unconvincing corporate PR blather about the interaction being a valuable “part of the learning process.”
Since generative AIs regularly scrape the Internet for data to train their LLMs, it’s no surprise that the stories got incorporated into the models and patterns chatbots use to suss out meaning.
It’s exactly what happened with Internet search, which swapped the biases of informed elites judging content with the biases of uninformed mobs and gave us a world understood through popularity instead of expertise.
No, what’s particularly weird is that the AIs reached pejorative conclusions about Roose that went far beyond the substance or volume of what he said, or what was said about his encounter with Sydney.
Like they had it out for him.
There are no good explanations for how this is happening. The transfomers that constitute the systems of chatbot minds work in mysterious ways.
But, like Internet search, there are ways to game the system, the simplest being generating and then strategicially placing stories intended to change what AIs see and learn. This can include putting weird code on webpages, understandable only to machines, and coloring it white so it isn’t distracting to mere mortal visitors.
It’s called “AIO,” for A.I. Optimization, echoing a similar buzzword for manipulating Internet searches (“SEO”). Just wait until those optimized AI results get matched with corporate sponsors.
It’ll be Machiavelli meets the madness of crowds.
In the meantime, it raises fascinating questions about how deserving AIs are of our trust, and to what degree we should depend on it for our decision-making and understanding of the world.
What happens if that otherwise perfectly operating AI reaches conclusions and voice opinions that are no more objectively true than the informed judgments of those elites we so readily threw in the garbage years ago (or the inanity of crowdsourced information that replaced them)?
Meet the new boss, the same as the old boss.
We will get fooled again.
[This essay originally appeared at Spiritual Telegraph]