Chain of Thought
How Chain of Thought Works
You are a language model trying to reason your way from a premise to a conclusion, one inference at a time. Between you and the answer is the latent void. Each stone you place is a valid step. Each misstep is a hallucination, and you fall.
- Read the chain of reasoning so far at the top of the path.
- Three candidate next steps appear. Exactly one validly follows from the chain.
- Tap the valid inference to lay down the next stepping stone and advance.
- Tap a non-sequitur, fallacy, or confident nonsense and you hallucinate — lose a life.
- Reach the conclusion to complete the chain. You have three lives, and chains get longer as you go.
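The rules above can be sketched as a small loop. This is only an illustration, not the game's actual code: `Step` and `play_chain` are hypothetical names, and it assumes a wrong pick costs a life and the same step is offered again, which the rules do not spell out.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence, Tuple


@dataclass
class Step:
    """One point in the chain: three candidates, exactly one valid."""
    candidates: Tuple[str, str, str]
    valid: int  # index (0-2) of the candidate that validly follows


def play_chain(steps: Sequence[Step], picks: Iterable[int], lives: int = 3):
    """Walk one chain and return (completed, lives_left).

    Assumption: after a wrong pick the same step is presented again.
    """
    pick_iter = iter(picks)
    position = 0
    while position < len(steps) and lives > 0:
        choice = next(pick_iter, None)
        if choice is None:
            break  # the player stopped picking
        if choice == steps[position].valid:
            position += 1  # lay a stepping stone and advance
        else:
            lives -= 1  # hallucination: lose a life
    return position == len(steps), lives


chain = [
    Step(("Socrates is mortal", "Socrates is Greek", "All men are Socrates"), 0),
    Step(("So nothing is mortal", "So Socrates will die", "So Plato is mortal"), 1),
]
result = play_chain(chain, [2, 0, 1])  # one misstep, then two valid picks
```

Here `result` records whether the conclusion was reached and how many lives remain. Longer chains are just longer `steps` lists.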
Why Is This Hard?
The wrong answers are not random — they are plausible. They are the things a confident model says when it has run out of actual reasoning: affirming the consequent, smuggling in a fact nobody gave you, or just vibing toward a conclusion that feels right. Resisting them is the whole eval.
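To see why affirming the consequent is a trap and not a stepping stone, a brute-force truth table is enough. The sketch below is illustrative only; `implies` and `is_valid` are hypothetical helper names. It treats an argument form as valid when no assignment of truth values makes every premise true and the conclusion false.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: 'a implies b' is false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion) -> bool:
    """Check a two-variable argument form against every truth assignment."""
    for p, q in product([True, False], repeat=2):
        if all(prem(p, q) for prem in premises) and not conclusion(p, q):
            return False  # counterexample: true premises, false conclusion
    return True


# Modus ponens: from "P implies Q" and "P", conclude "Q".
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q,
)

# Affirming the consequent: from "P implies Q" and "Q", conclude "P".
affirming_the_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
)
```

`modus_ponens` comes out `True` and `affirming_the_consequent` comes out `False`. The counterexample is P false, Q true: the rain-makes-the-street-wet rule holds and the street is wet, yet it did not rain.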
Slop Fact: Adding "Let's think step by step" to a prompt often boosts model accuracy on multi-step reasoning tasks, which is either a profound result about reasoning or proof that we built a machine that performs better when politely encouraged. Researchers remain too afraid to ask which.