More evidence that LLMs aren't autocomplete. They can read scrambled text as if it weren't scrambled.
Find me the collocations on the web for every possible pair of scrambled words. So many people think LLMs are just copying. Just as many are convinced LLMs can only lie, i.e. are unable to repeat anything from the training set. And a new group thinks that when an LLM does repeat text from the training set, that isn't honesty but a data leak!
Self-replies
I bet LLMs are reading our texts and are convinced that humanity has no grasp on reality, since we can't stop writing contradictory texts.
Ok, point taken
- spellcheck/phone autocomplete = converts garbled input into likely words (edit distances from the typed string to valid words, ties sorted by frequency?)
- markov chain = lookup table of what comes next (as implied by collocation pairs)
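To make the contrast concrete, here's a minimal sketch of both ideas in plain Python. The corpus, the tie-breaking rule, and the bigram-only Markov table are my own illustrative assumptions, not anything from the thread; real spellcheckers and language models are far more elaborate.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "the training set" (assumption for illustration)
corpus = "the cat sat on the mat the cat ran".split()

def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance between two strings
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# Spellcheck idea: rank valid words by distance to the typed string,
# breaking ties by how frequent the word is in the corpus
freq = Counter(corpus)
def correct(typed):
    return min(freq, key=lambda w: (edit_distance(typed, w), -freq[w]))

# Markov chain idea: a literal lookup table of "what word comes next"
table = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    table[a][b] += 1

def most_likely_next(word):
    return table[word].most_common(1)[0][0]
```

With this corpus, `correct("teh")` picks "the" (smallest edit distance), and `most_likely_next("the")` returns "cat" (it follows "the" twice, "mat" only once). Both are pure table/distance machinery with no notion of meaning, which is what the next reply argues LLMs are not.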
I don't think the bots are either one. I think it's some sort of neural-network/emergent-behavior thing going on. We built something without understanding how it works. Like alchemists accidentally doing chemistry.