Glossary

Every term, in plain language — and a link to the place on this site where you can watch it happen instead of reading about it.

world model: A system that predicts what happens next, given what is and what you do. Your brain runs one when it catches a ball; Level 3's neural network is one you trained yourself.; See it live: Level 1 — where the definition lands
agent: Anything that acts: a person, a robot, a piece of software. The half of the loop that chooses.; See it live: the agent loop in Level 1
environment: Everything the agent doesn't control — the rest of the world, which answers every action with an observation.; See it live: the agent loop in Level 1
state: Every fact about the world that matters for what happens next. In Bounce it's five numbers: two for the ball's position, two for its velocity, one for the paddle's position. Usually hidden; you see observations instead.; See it live: observation vs state in Level 1's Go deeper
observation: The glimpse of the world an agent actually receives — pixels, sensor readings, printed text. Partial, sometimes noisy, never the state itself.; See it live: the occlusion game (where the observation loses the ball)
action: What the agent does to the world. In Bounce: move the paddle left, stay, or right. 'Action-conditioned' models let your action change their prediction.; See it live: the action-conditioned filter in Level 5
latent state / latent space: The small code a model keeps after squeezing an observation through a bottleneck — what survives compression. Positions survive; texture dies.; See it live: the compression toy in Level 2
dynamics (model): The function that carries state forward in time: (state, action) → next state. Real physics has one; a world model learns an imitation of it.; See it live: training the dynamics network in Level 3
experience buffer: The dataset gameplay writes: one (state, action, next state) row per timestep. Experience is training data, literally.; See it live: Stage 1 of Level 3
rollout (dream): Running a model forward many steps on its own predictions — its step 2 computed from its step 1. Also called imagination, or a dream.; See it live: Stage 3 of Level 3
compounding error: The signature failure of free-running rollouts: each step's small mistake becomes the next step's input, so error grows with horizon — linearly at best, exponentially near sensitive events like bounces.; See it live: the failure lab in Level 4
MPC (model-predictive control): Plan by imagining: simulate many candidate action sequences through a model, take the first action of the best one, then replan from the new state at the next step. Random-shooting MPC samples the candidates at random.; See it live: Stage 4 of Level 3
model-based reinforcement learning: The family of methods that learn a model of the world and use it — for planning or for training a policy on imagined experience — instead of learning from real trial-and-error alone. Sample-efficient because imagined practice is cheap.; See it live: Level 3's Go deeper
policy: A learned reflex: a function straight from state to action, no planning at decision time. The alternative (and complement) to MPC.; See it live: MPC vs learned policies in Level 3's Go deeper
renderer: A world model whose job is producing views — what a camera or an eye would see. Renders appearances; need not track consequences.; See it live: the taxonomy in Level 5
simulator: A world model whose job is evolving a world: given state and action, produce what happens next. The job your Level 3 network does for Bounce.; See it live: the taxonomy in Level 5
planner: A world model (or its user) whose job is choosing actions by evaluating predicted futures. Your Stage 4 paddle was driven by one.; See it live: the taxonomy in Level 5
POMDP: Partially Observable Markov Decision Process — the formal frame for the whole story: hidden state, observations, actions, transitions. The reason inner models are necessary at all.; See it live: Level 1's Go deeper
teacher forcing: Training a predictor only on real states, never on its own outputs. Great for learning, but it quietly leaves the model unprepared for free-running dreams, where the inputs are its own slightly wrong predictions.; See it live: Level 4's Go deeper
divergence: This site's measure of how far a dream has drifted: the distance between the real and imagined ball positions after the same actions from the same start.; See it live: the divergence meter in Level 3
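
Several of the terms above (experience buffer, dynamics model, rollout, compounding error, divergence, random-shooting MPC) fit together in a few lines of code. What follows is a minimal Python sketch, not this site's implementation: the world is a hypothetical one-dimensional point rather than Bounce, `true_step` stands in for the real environment, and `model_step` stands in for a trained network by being the same rule with a deliberately wrong coefficient. All function names and constants are assumptions made for illustration.

```python
import random

ACTIONS = (-1, 0, 1)  # left, stay, right (hypothetical 1-D world, not Bounce)

def true_step(state, action):
    """Stand-in for the real environment: (state, action) -> next state."""
    x, v = state
    v = v + 0.1 * action
    return (x + v, v)

def model_step(state, action):
    """Stand-in for a learned dynamics model: slightly wrong on purpose."""
    x, v = state
    v = v + 0.11 * action  # assumed error in how strongly an action pushes
    return (x + v, v)

def collect_experience(start, actions):
    """Experience buffer: one (state, action, next state) row per timestep."""
    buffer, state = [], start
    for a in actions:
        nxt = true_step(state, a)
        buffer.append((state, a, nxt))
        state = nxt
    return buffer

def rollout(step_fn, state, actions):
    """Free-running rollout: each step's input is the previous step's output."""
    states = [state]
    for a in actions:
        state = step_fn(state, a)
        states.append(state)
    return states

def divergence(start, actions):
    """Distance between real and imagined positions, same start, same actions."""
    real = rollout(true_step, start, actions)
    dream = rollout(model_step, start, actions)
    return [abs(r[0] - d[0]) for r, d in zip(real, dream)]

def random_shooting_mpc(state, target, horizon=10, candidates=200, rng=random):
    """Sample random action sequences, score each inside the model,
    and return the first action of the best-scoring sequence."""
    best_cost, best_first = float("inf"), 0
    for _ in range(candidates):
        seq = [rng.choice(ACTIONS) for _ in range(horizon)]
        final_x = rollout(model_step, state, seq)[-1][0]
        cost = abs(final_x - target)  # assumed goal: end the horizon near target
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first

if __name__ == "__main__":
    # Compounding error: the gap widens as the dream gets longer.
    print(divergence((0.0, 0.0), [1] * 20))

    # MPC loop: plan in the model, act in the real world, replan every step.
    rng = random.Random(0)
    state = (0.0, 0.0)
    for _ in range(30):
        action = random_shooting_mpc(state, target=1.0, rng=rng)
        state = true_step(state, action)
    print(state)
```

In this toy the model's error is a fixed coefficient, so the divergence grows smoothly with horizon; a real learned model near a bounce can drift much faster. The planner only ever consults `model_step`, while the loop at the bottom advances the world with `true_step`, which is the separation between imagining and acting that the MPC entry describes.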