Rule Roulette
Tap anywhere to begin training
How Rule Roulette Works
This is a distribution-shift eval wearing a party hat. A grid of glowing shapes appears, and a rule banner tells you the current win condition. Satisfy it. Then — without warning — the objective swaps, and the policy that scored points one second ago now gets you penalized. Welcome to deployment.
- Read the rule banner and tap the targets it asks for
- Every few seconds the rule flips at random — re-read it instantly
- Acting on the old rule costs a life and nukes your combo
- Three lives. Rules flip faster the longer you survive
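For the curious, the bookkeeping behind those four bullets can be sketched in a few lines of Python. This is a hedged illustration, not the game's actual code: `GameState`, the scoring formula, the 4-second starting interval, the 5% speed-up per 10 seconds survived, and the 1-second floor are all assumptions made up for the example. Only the three lives, the combo reset, and "flips get faster" come from the rules above.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical state tracker; names and numbers are illustrative assumptions."""
    lives: int = 3                 # "Three lives" (from the rules above)
    combo: int = 0
    score: int = 0
    base_flip_interval: float = 4.0  # assumed starting seconds between rule flips

    def register_tap(self, satisfies_current_rule: bool) -> None:
        """Apply one tap: reward it if it matches the live rule, punish it otherwise."""
        if satisfies_current_rule:
            self.combo += 1
            self.score += self.combo   # assumed scoring: longer combos pay more
        else:
            self.lives -= 1            # acting on the old rule costs a life...
            self.combo = 0             # ...and nukes the combo

    def flip_interval(self, seconds_survived: float) -> float:
        """Seconds until the next flip; shrinks as survival time grows (assumed curve)."""
        steps = seconds_survived // 10            # one speed-up step per 10 s survived
        return max(1.0, self.base_flip_interval * 0.95 ** steps)

    @property
    def game_over(self) -> bool:
        return self.lives <= 0


state = GameState()
state.register_tap(True)    # correct tap: combo grows
state.register_tap(False)   # stale-rule tap: lose a life, combo resets
```

The point of the sketch is the asymmetry: a correct tap only adds to the combo, while a single stale tap both removes a life and resets it.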
The Rules It Throws At You
Tap the red ones. Avoid red entirely. Tap the biggest. Tap the odd one out. Tap shapes in size order, small to large. Or the cruellest of all: tap nothing and just survive the timer. Each is trivial. Switching between them under pressure is not.
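Each rule really is trivial on its own, which a rough Python sketch makes plain. Everything named here is hypothetical: the `Shape` fields, the rule identifiers, and the `legal_targets` helper are invented for illustration, and "odd one out" is assumed to mean "the only shape of its kind on the grid." The sketch also assumes no two shapes on the grid are identical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    kind: str    # e.g. "circle", "square" (assumed attribute)
    color: str   # e.g. "red", "blue" (assumed attribute)
    size: int    # larger number means bigger shape (assumed attribute)


def legal_targets(rule, grid, already_tapped=()):
    """Return the shapes that are safe to tap next under the given rule."""
    remaining = [s for s in grid if s not in already_tapped]
    if rule == "tap_red":
        return [s for s in remaining if s.color == "red"]
    if rule == "avoid_red":
        return [s for s in remaining if s.color != "red"]
    if rule == "tap_biggest":
        biggest = max((s.size for s in grid), default=0)
        return [s for s in remaining if s.size == biggest]
    if rule == "odd_one_out":
        counts = Counter(s.kind for s in grid)
        return [s for s in remaining if counts[s.kind] == 1]
    if rule == "size_order":
        # Small to large: only the smallest untapped shape is legal next.
        if not remaining:
            return []
        smallest = min(s.size for s in remaining)
        return [s for s in remaining if s.size == smallest]
    if rule == "tap_nothing":
        return []   # every tap is wrong; just survive the timer
    raise ValueError(f"unknown rule: {rule}")


grid = [Shape("circle", "red", 1), Shape("circle", "blue", 3), Shape("square", "green", 2)]
red_targets = legal_targets("tap_red", grid)
safe_targets = legal_targets("avoid_red", grid)
```

Note that `tap_red` and `avoid_red` return exact complements on the same grid, which is why acting on the old rule right after a flip is punished so reliably.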
Slop Fact: A model trained to convergence on one objective will confidently keep optimizing it long after the objective changed. We call that "robust." You will call it "losing a life." Same gradient, different cope.