Dream Machine

Glossary

Every term, in plain language — and a link to the place on this site where you can watch it happen instead of reading about it.

world model
A system that predicts what happens next, given what is and what you do. Your brain runs one when it catches a ball; Level 3's neural network is one you trained yourself.
See it live: Level 1 — where the definition lands
agent
Anything that acts: a person, a robot, a piece of software. The half of the loop that chooses.
See it live: the agent loop in Level 1
environment
Everything the agent doesn't control — the rest of the world, which answers every action with an observation.
See it live: the agent loop in Level 1
state
Every fact about the world that matters for what happens next. In Bounce it's five numbers: ball position (two), ball velocity (two), and paddle position (one). Usually hidden; you see observations instead.
See it live: observation vs state in Level 1's Go deeper
observation
The glimpse of the world an agent actually receives — pixels, sensor readings, printed text. Partial, sometimes noisy, never the state itself.
See it live: the occlusion game (where the observation loses the ball)
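As a rough illustration of the gap between state and observation, here is a minimal sketch. The field names, the `observe` function, and its `noise` and `occluded` parameters are all assumptions made up for this example, not the site's code. The point is only that velocities never appear in what the agent receives, and the ball can vanish entirely.

```python
import random
from typing import NamedTuple, Optional, Tuple

class State(NamedTuple):
    # Five numbers, following the glossary's description of Bounce.
    # The field names are assumptions for this sketch.
    ball_x: float
    ball_y: float
    ball_vx: float
    ball_vy: float
    paddle_x: float

Observation = Tuple[Optional[Tuple[float, float]], float]

def observe(state: State, noise: float = 0.0, occluded: bool = False,
            rng: Optional[random.Random] = None) -> Observation:
    """Return (ball position or None, paddle position). Velocities are never exposed."""
    if rng is None:
        rng = random.Random(0)
    if occluded:
        ball = None  # the observation loses the ball; the state still has it
    else:
        ball = (state.ball_x + rng.gauss(0.0, noise),
                state.ball_y + rng.gauss(0.0, noise))
    return ball, state.paddle_x

s = State(0.3, 0.7, 0.5, -0.2, 0.4)
clear = observe(s)                  # noiseless glimpse
hidden = observe(s, occluded=True)  # ball behind an occluder
```

Two states with the same positions but different velocities produce the same `clear` observation, which is why a single glimpse is not enough to predict what happens next.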
action
What the agent does to the world. In Bounce: move the paddle left, stay, or right. 'Action-conditioned' models let your action change their prediction.
See it live: the action-conditioned filter in Level 5
latent state / latent space
The small code a model keeps after squeezing an observation through a bottleneck — what survives compression. Positions survive; texture dies.
See it live: the compression toy in Level 2
dynamics (model)
The function that carries state forward in time: (state, action) → next state. Real physics has one; a world model learns an imitation of it.
See it live: training the dynamics network in Level 3
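The (state, action) → next state signature can be sketched with a hand-written toy. This is a hypothetical 1-D stand-in (a ball between walls at 0 and 1, with the action nudging its velocity), not Bounce's physics and not the network trained in Level 3.

```python
from typing import Tuple

State = Tuple[float, float]  # toy (position, velocity), not Bounce's five numbers

def step(state: State, action: int, dt: float = 0.1) -> State:
    """Toy dynamics: (state, action) -> next state.

    action is -1, 0, or +1 and nudges the velocity (an invented rule).
    Velocity is clamped to [-2, 2]; walls at 0.0 and 1.0 reflect the ball.
    """
    pos, vel = state
    vel = max(-2.0, min(2.0, vel + 0.1 * action))
    pos = pos + vel * dt
    if pos < 0.0:
        pos, vel = -pos, -vel
    elif pos > 1.0:
        pos, vel = 2.0 - pos, -vel
    return (pos, vel)

coasting = step((0.5, 1.0), action=0)   # free flight
bounced = step((0.95, 1.0), action=0)   # crosses the right wall and reflects
```

A learned dynamics model has the same signature; the difference is that its body is fitted from data instead of written by hand.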
experience buffer
The dataset gameplay writes: one (state, action, next state) row per timestep. Experience is training data, literally.
See it live: Stage 1 of Level 3
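The "one row per timestep" idea can be sketched as follows. The toy `step` function, the random action choice, and the `collect` helper are assumptions for illustration; the site's buffer may be laid out differently.

```python
import random
from typing import List, Tuple

State = Tuple[float, float]      # toy (position, velocity)
Row = Tuple[State, int, State]   # (state, action, next state)

def step(state: State, action: int, dt: float = 0.1) -> State:
    """Hypothetical toy dynamics: action nudges velocity; walls at 0 and 1 reflect."""
    pos, vel = state
    vel = max(-2.0, min(2.0, vel + 0.1 * action))
    pos = pos + vel * dt
    if pos < 0.0:
        pos, vel = -pos, -vel
    elif pos > 1.0:
        pos, vel = 2.0 - pos, -vel
    return (pos, vel)

def collect(n_steps: int, seed: int = 0) -> List[Row]:
    """Play with random actions and log one row per timestep."""
    rng = random.Random(seed)
    buffer: List[Row] = []
    state: State = (0.5, 1.0)
    for _ in range(n_steps):
        action = rng.choice([-1, 0, 1])
        next_state = step(state, action)
        buffer.append((state, action, next_state))
        state = next_state
    return buffer

buffer = collect(100)
```

Each row is a complete supervised example: the first two fields are the input, the third is the target.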
rollout (dream)
Running a model forward many steps on its own predictions — its step 2 computed from its step 1. Also called imagination, or a dream.
See it live: Stage 3 of Level 3
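In code, a rollout is a loop that feeds the model its own previous output. The sketch below assumes a hypothetical `model(state, action)` callable; `toy_model` is a hand-written stand-in for a learned network.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, float]  # toy (position, velocity)
Model = Callable[[State, int], State]

def toy_model(state: State, action: int) -> State:
    """Stand-in for a learned model (invented rule: action nudges velocity)."""
    pos, vel = state
    vel = vel + 0.1 * action
    return (pos + 0.1 * vel, vel)

def rollout(model: Model, start: State, actions: Sequence[int]) -> List[State]:
    """Dream forward: every step after the first is computed from a prediction."""
    states = [start]
    for action in actions:
        states.append(model(states[-1], action))
    return states

actions = [1, 1, 0]
dream = rollout(toy_model, (0.5, 0.0), actions)
```

The real world is consulted only once, for `start`. Everything after that is imagination.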
compounding error
The signature failure of free-running rollouts: each step's small mistake becomes the next step's input, so error grows with horizon — linearly at best, exponentially near sensitive events like bounces.
See it live: the failure lab in Level 4
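One standard way to see where "linearly" and "exponentially" come from is a worst-case bound. The symbols here are assumptions introduced for the sketch: $f$ is the true dynamics, $\hat f$ the model, $\varepsilon$ a bound on the model's one-step error, $L$ a Lipschitz constant describing how strongly $f$ amplifies differences in state, and $e_t$ the gap between dreamed and real state at step $t$.

```latex
% Assumptions: \|\hat f(s,a) - f(s,a)\| \le \varepsilon for all (s,a),
% and \|f(s,a) - f(s',a)\| \le L \|s - s'\|. Both start from the same state, so e_0 = 0.
\begin{align*}
e_{t+1} &= \|\hat f(\hat s_t, a_t) - f(s_t, a_t)\| \\
        &\le \|\hat f(\hat s_t, a_t) - f(\hat s_t, a_t)\|
           + \|f(\hat s_t, a_t) - f(s_t, a_t)\| \\
        &\le \varepsilon + L\, e_t , \\[4pt]
e_T &\le \varepsilon \sum_{k=0}^{T-1} L^{k}
     = \begin{cases}
         T\,\varepsilon, & L = 1, \\[2pt]
         \varepsilon\,\dfrac{L^{T} - 1}{L - 1}, & L > 1 .
       \end{cases}
\end{align*}
```

When the dynamics pass differences through unchanged ($L = 1$), the bound grows linearly with the horizon $T$. Where small differences in state are amplified ($L > 1$), as near a bounce, the bound grows geometrically.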
MPC (model-predictive control)
Plan by imagining: simulate many candidate action sequences through a model, take the first action of the best one, replan constantly. Random-shooting MPC samples the candidates at random.
See it live: Stage 4 of Level 3
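Random-shooting MPC fits in a few lines. The sketch below uses an invented toy problem (a paddle that should reach a fixed target) with a hand-written `model` and `cost`; the horizon and candidate count are arbitrary choices, not the site's settings.

```python
import random
from typing import Optional, Tuple

ACTIONS = (-1, 0, 1)
State = Tuple[float, float]  # toy (paddle position, target position)

def model(state: State, action: int) -> State:
    """Hypothetical model: each action moves the paddle by 0.1."""
    paddle, target = state
    return (paddle + 0.1 * action, target)

def cost(state: State) -> float:
    """Distance from paddle to target; lower is better."""
    paddle, target = state
    return abs(paddle - target)

def random_shooting_mpc(state: State, horizon: int = 3, n_candidates: int = 200,
                        rng: Optional[random.Random] = None) -> int:
    """Sample action sequences, imagine each one, return the best one's first action."""
    if rng is None:
        rng = random.Random(0)
    best_cost, best_first = float("inf"), 0
    for _ in range(n_candidates):
        seq = [rng.choice(ACTIONS) for _ in range(horizon)]
        imagined, total = state, 0.0
        for action in seq:
            imagined = model(imagined, action)  # simulated, never executed
            total += cost(imagined)
        if total < best_cost:
            best_cost, best_first = total, seq[0]
    return best_first

go_right = random_shooting_mpc((0.0, 1.0))
go_left = random_shooting_mpc((1.0, 0.0))
```

Only the first action of the winning sequence is used. In a control loop you would execute it, observe the new state, and call the function again, which is the "replan constantly" part.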
model-based reinforcement learning
The family of methods that learn a model of the world and use it — for planning or for training a policy on imagined experience — instead of learning from real trial-and-error alone. Sample-efficient because imagined practice is cheap.
See it live: Level 3's Go deeper
policy
A learned reflex: a function straight from state to action, no planning at decision time. The alternative (and complement) to MPC.
See it live: MPC vs learned policies in Level 3's Go deeper
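The shape of a policy is easy to show even without training one. The function below is a hand-written stand-in with an invented `deadband` parameter, not a learned policy; what it shares with a learned one is the signature, state in and action out, with no imagined futures in between.

```python
def chase_policy(ball_x: float, paddle_x: float, deadband: float = 0.02) -> int:
    """Map state straight to an action: -1 (left), 0 (stay), or +1 (right)."""
    if ball_x > paddle_x + deadband:
        return 1
    if ball_x < paddle_x - deadband:
        return -1
    return 0

right = chase_policy(ball_x=0.8, paddle_x=0.2)
left = chase_policy(ball_x=0.2, paddle_x=0.8)
stay = chase_policy(ball_x=0.5, paddle_x=0.5)
```

A single call costs a couple of comparisons, whereas a planner has to simulate many candidate futures before every move.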
renderer
A world model whose job is producing views — what a camera or an eye would see. Renders appearances; need not track consequences.
See it live: the taxonomy in Level 5
simulator
A world model whose job is evolving a world: given state and action, produce what happens next. The job your Level 3 network does for Bounce.
See it live: the taxonomy in Level 5
planner
A world model (or its user) whose job is choosing actions by evaluating predicted futures. Your Stage 4 paddle was driven by one.
See it live: the taxonomy in Level 5
POMDP
Partially Observable Markov Decision Process — the formal frame for the whole story: hidden state, observations, actions, transitions. The reason inner models are necessary at all.
See it live: Level 1's Go deeper
teacher forcing
Training a predictor only from real states, never from its own outputs. Great for learning, but it quietly leaves the model unprepared for free-running dreams, where the inputs are its own slightly wrong predictions.
See it live: Level 4's Go deeper
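The mismatch between the two input regimes can be shown numerically. This sketch assumes an invented 1-D world and a model with a deliberate 2% bias. It compares the regimes at evaluation time only and does not perform any training.

```python
def true_step(pos: float, vel: float = 1.0, dt: float = 0.1) -> float:
    """Toy ground truth: constant-velocity motion."""
    return pos + vel * dt

def model_step(pos: float, vel: float = 1.0, dt: float = 0.1) -> float:
    """Hypothetical imperfect model: overestimates each step by 2%."""
    return pos + 1.02 * vel * dt

STEPS = 20
real = [0.0]
for _ in range(STEPS):
    real.append(true_step(real[-1]))

# Teacher-forced regime: every input is a real state.
forced_err = [abs(model_step(real[t]) - real[t + 1]) for t in range(STEPS)]

# Free-running regime: every input after the first is the model's own output.
dream = [real[0]]
for _ in range(STEPS):
    dream.append(model_step(dream[-1]))
free_err = [abs(dream[t + 1] - real[t + 1]) for t in range(STEPS)]
```

`forced_err` stays near 0.002 at every step, while `free_err` grows with the horizon and reaches roughly 0.04 by step 20. A model judged only on the first number looks better than it dreams.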
divergence
This site's measure of how far a dream has drifted: the distance between the real and imagined ball positions after the same actions from the same start.
See it live: the divergence meter in Level 3
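A minimal sketch of such a measure, assuming "distance" means straight-line (Euclidean) distance between 2-D ball positions. The site may compute it differently, and the sample positions below are made up.

```python
import math
from typing import List, Sequence, Tuple

Position = Tuple[float, float]

def divergence(real_pos: Position, dream_pos: Position) -> float:
    """Euclidean distance between real and imagined ball positions."""
    return math.dist(real_pos, dream_pos)

def divergence_curve(real: Sequence[Position], dream: Sequence[Position]) -> List[float]:
    """Divergence at each step of two trajectories with the same start and actions."""
    return [divergence(r, d) for r, d in zip(real, dream)]

real_path = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.2)]
dream_path = [(0.0, 0.0), (0.1, 0.1), (0.5, 0.6)]
curve = divergence_curve(real_path, dream_path)
```

The curve starts at zero because both trajectories share a start. How quickly it leaves zero is what a divergence meter would display.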