Predict Yourself
How Predict Yourself Works
This is a calibration eval for the most overfit model in the room: you. First you build a tiny preference dataset by answering fast this-or-that questions. Then we delete the labels and ask you to reconstruct them from memory. Your reward is how well your self-model matches your actual behavior.
- Phase 1 — Choose: tap one of two options on each preference question. Don't overthink. We're logging everything.
- Phase 2 — Predict: we re-present each question and ask which you picked. Tap your recalled answer.
- Every correct self-prediction is a point. Match all of them for a perfect self-model.
- Try to beat your best self-model accuracy. Your best score is saved in your browser, not the cloud.
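For the curious, the scoring above can be sketched in a few lines. This is a minimal illustration, not the game's actual code: the `Round` record, the function names, and the sample questions are all hypothetical, and the "best score" is just passed in as a number rather than read from the browser.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Round:
    """One question, as seen in both phases (hypothetical structure)."""
    question: str
    chosen: str     # option tapped in Phase 1
    predicted: str  # option recalled in Phase 2


def self_model_accuracy(rounds: List[Round]) -> float:
    """Fraction of questions where the recalled answer matches the original choice."""
    if not rounds:
        return 0.0
    # One point per correct self-prediction.
    points = sum(1 for r in rounds if r.predicted == r.chosen)
    return points / len(rounds)


def update_best(previous_best: Optional[float], accuracy: float) -> float:
    """Return the personal best after this run (no stored best means this run wins)."""
    if previous_best is None:
        return accuracy
    return max(previous_best, accuracy)


# Made-up sample data for illustration only.
rounds = [
    Round("Coffee or tea?", chosen="tea", predicted="tea"),
    Round("Window or aisle?", chosen="aisle", predicted="window"),
    Round("Cats or dogs?", chosen="dogs", predicted="dogs"),
    Round("Sunrise or sunset?", chosen="sunset", predicted="sunset"),
]
accuracy = self_model_accuracy(rounds)
best = update_best(0.5, accuracy)
print(f"{accuracy:.0%} this run, best so far {best:.0%}")
```

Three matches out of four gives 75%, which replaces the assumed previous best of 50%. A perfect self-model is simply an accuracy of 1.0.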
Why Is This Hard?
Some questions are deliberately forgettable — low salience, near-coin-flip choices that your memory never bothered to encode. That's the latent space being honest: you don't actually store most of your own decisions. You confabulate them later, confidently, like a language model filling a gap with the most probable token.
Slop Fact: Humans reliably overestimate their self-prediction accuracy, a bias so consistent it's basically RLHF for your ego. If you score below 60%, congratulations: you are less self-aware than a 7B-parameter model that has never met you.