Catastrophic Forgetting
How Catastrophic Forgetting Works
This is continual learning, the hard kind. You are a small neural net being taught one little task after another: a glyph appears, and you must map it to the correct response. Easy with one task. Then we give you a second task. Then a third. Each new task's gradient updates quietly overwrite the weights that encoded the old ones. This is the stability–plasticity dilemma, and you are about to lose it.
- Learn a fresh task: we show you a glyph and its correct response, then drill you until you're fluent.
- Once you're fluent, a new task is added, and from then on prompts are interleaved across every task you've learned.
- Each task has a memory strength. Answering it correctly refreshes it; answering other tasks lets it decay (interference). Wrong answers damage it further.
- Rehearse (R) to re-view an old, fading mapping and shore it up. You only get a few rehearsals, and each one is a turn you don't spend learning.
- Recall errors burn Coherence. Hit zero and the model collapses into incoherent slop.
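For the curious, the rules above can be sketched in a few lines of Python. This is a minimal sketch, not the game's actual code: the `Learner` class, its method names, and every constant (`REFRESH`, `DECAY`, `PENALTY`, the starting Coherence and rehearsal counts) are assumptions made up for illustration, and it assumes a rehearsal does not itself decay other tasks.

```python
from dataclasses import dataclass, field

# Illustrative constants only; the game's real tuning is not documented here.
REFRESH = 0.30   # strength gained by a correct answer or a rehearsal
DECAY = 0.05     # strength every *other* task loses per answered prompt
PENALTY = 0.20   # extra strength lost on a wrong answer


@dataclass
class Learner:
    strength: dict = field(default_factory=dict)  # task -> strength in [0, 1]
    coherence: int = 5     # assumed starting Coherence
    rehearsals: int = 3    # assumed rehearsal budget

    def learn(self, task):
        """A freshly drilled task starts at full strength."""
        self.strength[task] = 1.0

    def _bump(self, task, delta):
        # Keep strength inside [0, 1].
        self.strength[task] = min(1.0, max(0.0, self.strength[task] + delta))

    def answer(self, task, correct):
        """Apply one prompt. Returns False once Coherence hits zero."""
        for other in self.strength:
            if other != task:
                self._bump(other, -DECAY)      # interference
        if correct:
            self._bump(task, REFRESH)          # refresh
        else:
            self._bump(task, -PENALTY)         # damage
            self.coherence -= 1                # recall errors burn Coherence
        return self.coherence > 0

    def rehearse(self, task):
        """Spend one rehearsal to shore up a task, if any are left."""
        if self.rehearsals == 0:
            return False
        self.rehearsals -= 1
        self._bump(task, REFRESH)
        return True
```

Calling `answer("B", True)` on a learner that knows tasks A and B refreshes B and shaves a little off A, which is the whole problem in miniature.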
How To Score Well
The headline number is tasks retained above threshold — how many mappings you kept alive at once. Watch the memory shelf: when a bar dims toward red, that task is slipping. Spend a precious rehearsal, or accept the loss and protect your stronger memories. There is no free lunch and there is definitely no free replay buffer.
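The score and the "which bar is dimming" check might look roughly like this. The function names and the 0.5 threshold are hypothetical; the game's real threshold is not stated here.

```python
def tasks_retained(strengths, threshold=0.5):
    """Count tasks whose memory strength is above the threshold."""
    return sum(1 for s in strengths.values() if s > threshold)


def most_at_risk(strengths, threshold=0.5):
    """Return the still-retained task closest to slipping, or None."""
    alive = {task: s for task, s in strengths.items() if s > threshold}
    if not alive:
        return None
    return min(alive, key=alive.get)
```

A task already below the threshold is ignored by `most_at_risk` on purpose: that is the "accept the loss" half of the trade-off, and the rehearsal goes to the weakest memory you can still save.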
Slop Fact: A real network trained on Task A and then only on Task B can forget Task A almost completely. McCloskey & Cohen named it "catastrophic interference" in 1989. The industry's blunt fix was to just keep all the old data forever and retrain on everything, which is less "remembering" and more "hoarding." You, sadly, do not get the old data forever.
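To watch the effect outside the game, here is a deliberately tiny toy: a one-weight "network" trained with plain SGD on Task A, then on Task B, then again with the old data mixed back in. The tasks (`y = 2x` and `y = -x`), the learning rate, and the epoch count are assumptions picked for illustration. This is not the McCloskey & Cohen experiment.

```python
def sgd(w, data, lr=0.1, epochs=50):
    """Plain SGD on squared error for the model y_hat = w * x."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x   # gradient of (w*x - y)^2 w.r.t. w
    return w


def mse(w, data):
    """Mean squared error of y_hat = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


xs = [-1.0, -0.5, 0.5, 1.0]
task_a = [(x, 2.0 * x) for x in xs]    # Task A wants w = 2
task_b = [(x, -1.0 * x) for x in xs]   # Task B wants w = -1

w_a = sgd(0.0, task_a)                 # learn Task A
loss_a_before = mse(w_a, task_a)

w_seq = sgd(w_a, task_b)               # then train on Task B only
loss_a_after = mse(w_seq, task_a)      # Task A has been overwritten

w_replay = sgd(w_a, task_a + task_b)   # "hoarding": keep Task A's data around
loss_a_replay = mse(w_replay, task_a)

print(loss_a_before, loss_a_after, loss_a_replay)
```

With a single weight the two tasks conflict outright, so replay cannot save both. It only settles on a compromise that hurts Task A less than pure sequential training does. Real networks have far more capacity, but the overwrite works the same way.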