The AI Ouroboros
Starting With Music
When Spotify's "AI DJ" came out I was pretty happy with it. For a few weeks it mixed in stuff I hadn't heard along with selections from my existing playlists. Most of the new stuff was relatively close to my earlier picks, but I was impressed. Then it started to eat itself.
One of the main taglines I get from the DJ is "here's some stuff from an artist you've been playing a lot." The problem is, I wasn't the one who had been playing those tracks. It was the AI that had put them in rotation. The system is stuck feeding back on itself. Not great, since I'm not really getting new music any more, but at least it's not getting worse.
Getting Worse
Unlike music recommendations, the content from ChatGPT-style AI systems is most likely going to get worse. AI companies are racing to ingest as much content as possible to feed the "large language models" that power them. They're invariably picking up the output from other AI systems (if not their own). That includes all the errors (euphemistically called "hallucinations") from previous generations of content.
As errors are fed back into the systems, they'll cause more errors in later output. Those new errors will get scraped and fed back in, causing even more. The cycle repeats indefinitely.
An Image Example
To show what I mean, we can use an image optimization technique called "image compression" as an analogy for AI feeding back on itself.
Image compression is used to make an image's file size smaller. The smaller size means the image will download faster since it doesn't take as much bandwidth.
Image compression is generally done by throwing out a little bit of detail from the image. It's so slight it's usually undetectable. For example, here are two versions of the same photo. This first one is the original export of the image. It's 767 KB.
This second image is simply a compressed version of the first one. It's 688 KB.
If I mixed up the images and put them side-by-side you'd be hard pressed to tell which one is compressed and which one isn't.
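If you want to try this yourself, here's roughly what a single compression pass looks like in Python using the Pillow library. The file names and the quality setting are placeholders, not the exact ones I used for the photos above.

from PIL import Image

# Open the original export and re-save it as a JPEG at a lower
# quality setting. The encoder throws away a little detail in
# exchange for a smaller file size.
original = Image.open("original.jpg")
original.save("compressed.jpg", format="JPEG", quality=85)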
More Iterations
Even though the images look identical, they aren't. That reduction in file size comes at the cost of a minor degradation in the quality of the image. If you only do it once, it's hard to notice. If you run an already compressed image through compression again, it loses more detail. You can do this a few times without a noticeable impact, but eventually it catches up with you.
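The repeated version is just that same save-and-reopen step in a loop. Here's a rough sketch, again with placeholder file names and a placeholder quality setting rather than the exact ones behind the images below.

from PIL import Image

image = Image.open("original.jpg")

for i in range(100):
    # Each pass re-encodes pixels that have already been through the
    # encoder, so the small losses stack instead of canceling out.
    image.save("recompressed.jpg", format="JPEG", quality=85)
    image = Image.open("recompressed.jpg")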
Here's what 100 iterations looks like with all the little losses of detail compounding on each other:
It's an interesting effect, but if what you're after is the quality of the original image it's a problem. And it's worth pointing out that this image is 687 KB. The additional compression iterations didn't reduce the file size; they just made the image worse.
Keep the process going for 1,000 iterations and you get a mess that looks like this:
The Ouroboros
The image of an ouroboros is that of a snake eating its own tail. For the AI world, it's the process of AI systems ingesting AI-generated content. The errors get fed back in and compound into more errors, the same way repeated compression breaks down an image. But instead of images, they'll be adding errors to recipes, and instructions, and facts. We'll be left with mountains of content and answers to questions that are simply wrong.
"What's the solution?" you ask. That's a great question. I'm not sure there is one.
Endnotes
- I expect there will be folks who state that this is being worked on and "it's only a matter of time till it's solved." That's possible, but one truism of computers for as long as they've been around is "garbage in, garbage out."