SPECULATIVE DECODING TERMINAL

Watch how a small draft model proposes tokens that a larger target model then validates. Because the target model checks a whole batch of draft tokens in one parallel pass instead of generating them one at a time, text generation is faster.
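The draft-then-validate loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind this visualizer. It assumes greedy decoding, with a draft token accepted only when it matches the target's own choice, and it omits the extra "bonus" token that full implementations take from the target when every draft token is accepted. The names `speculative_decode`, `toy_target`, `toy_draft`, and the `CONTINUATION` text are hypothetical stand-ins for real models.

```python
from typing import Callable, List, Tuple

Token = str
# A "model" here is any function that maps a token context to the next token.
NextToken = Callable[[List[Token]], Token]


def speculative_decode(
    prompt: List[Token],
    draft_next: NextToken,
    target_next: NextToken,
    max_new_tokens: int,
    batch_size: int = 4,
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, target validation passes)."""
    tokens = list(prompt)
    target_passes = 0
    goal = len(prompt) + max_new_tokens

    while len(tokens) < goal:
        k = min(batch_size, goal - len(tokens))

        # 1. The draft model proposes k tokens one after another (cheap, sequential).
        draft: List[Token] = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))

        # 2. The target model scores every draft position. A real system does this
        #    in one batched forward pass; the loop below only simulates that pass,
        #    so it is counted once.
        target_passes += 1
        target_choices = [target_next(tokens + draft[:i]) for i in range(k)]

        # 3. Keep draft tokens while they match the target. At the first mismatch,
        #    keep the target's token instead and discard the rest of the draft.
        accepted: List[Token] = []
        for d, t in zip(draft, target_choices):
            if d == t:
                accepted.append(d)
            else:
                accepted.append(t)
                break
        tokens.extend(accepted)

    return tokens, target_passes


# Hypothetical toy models: the target replays a fixed continuation, and the
# draft agrees with it everywhere except on one token.
PROMPT = ["The", "future", "of", "AI", "is"]
CONTINUATION = ["bright", "and", "full", "of", "promise", "for", "everyone", "."]


def toy_target(context: List[Token]) -> Token:
    return CONTINUATION[(len(context) - len(PROMPT)) % len(CONTINUATION)]


def toy_draft(context: List[Token]) -> Token:
    i = (len(context) - len(PROMPT)) % len(CONTINUATION)
    return "hope" if i == 4 else CONTINUATION[i]


tokens, passes = speculative_decode(PROMPT, toy_draft, toy_target, max_new_tokens=8)
print(" ".join(tokens))
print("target passes:", passes)
```

With these toy models the output is identical to what the target alone would produce, but it takes 3 validation passes instead of 8 sequential target calls. Every kept token is one the target itself would have chosen, so under greedy matching speculation changes only the speed, not the text.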

WITH SPECULATION

SPECULATIVE DECODING VISUALIZER

Token generation and validation flow

VALIDATED OUTPUT:

Step 1 of 25: Initial prompt loaded. Ready to begin generation.

The future of AI is
DRAFT MODEL (GPT-2 • 124M): AUTOREGRESSIVE • SEQUENTIAL
Waiting...
TARGET MODEL (GPT-3 • 175B): PARALLEL VALIDATION
Awaiting tokens...
BATCH SIZE: 4
DRAFT TIME: 0 ms
TARGET TIME: 0 ms
TOTAL TIME: 0 ms

WITHOUT SPECULATION

TARGET MODEL ONLY

Sequential token generation without speculation

VALIDATED OUTPUT:

Step 1 of 17: Initial prompt loaded. Ready to begin generation.

The future of AI is
TARGET MODEL (GPT-3 • 175B): SEQUENTIAL GENERATION
Waiting...
TARGET TIME: 0 ms
TOTAL TIME: 0 ms
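For comparison, the target-only panel corresponds to plain autoregressive generation: one target forward pass per new token, each waiting on the previous one. The sketch below is a rough illustration under that assumption. `sequential_decode`, `baseline_target`, and the `BASELINE_TEXT` continuation are hypothetical names and data, not part of the visualizer.

```python
from typing import Callable, List, Tuple


def sequential_decode(
    prompt: List[str],
    target_next: Callable[[List[str]], str],
    max_new_tokens: int,
) -> Tuple[List[str], int]:
    """Target-only greedy generation; returns (tokens, target forward passes)."""
    tokens = list(prompt)
    target_passes = 0
    for _ in range(max_new_tokens):
        # Each new token needs its own pass over the full context so far.
        tokens.append(target_next(tokens))
        target_passes += 1
    return tokens, target_passes


# Hypothetical stand-in for the large model: it replays a fixed continuation.
BASELINE_PROMPT = ["The", "future", "of", "AI", "is"]
BASELINE_TEXT = ["bright", "and", "full", "of", "promise", "for", "everyone", "."]


def baseline_target(context: List[str]) -> str:
    return BASELINE_TEXT[(len(context) - len(BASELINE_PROMPT)) % len(BASELINE_TEXT)]


baseline_tokens, baseline_passes = sequential_decode(
    BASELINE_PROMPT, baseline_target, max_new_tokens=8
)
print(" ".join(baseline_tokens))
print("target passes:", baseline_passes)
```

Here the number of target passes always equals the number of generated tokens (8 in this run), which is the cost that batched validation of draft tokens is meant to reduce.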