Multi-Token Generation (With KV Cache)

Starting generation: the token "then" issues a query for next-token prediction.

Autoregressive token prediction (with cache):

[Figure: the KV cache holds the key/value pairs for the context tokens, K1/V1 for "John" and K2/V2 for "sat" (values not shown); the new token "then" contributes only a query vector Q3 = [0.31, 0.92, 0.56], which attends over the cached entries.]
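
A minimal sketch of the attention step the figure depicts: the query vector for "then" (values taken from the figure) is scored against the cached key vectors for "John" and "sat", and the resulting softmax weights mix the cached value vectors. The key/value numbers here are assumed placeholders, since the figure does not show them.

```python
import numpy as np

# Cached key/value vectors for the context tokens "John" and "sat".
# (Placeholder values -- the figure does not show the actual numbers.)
K_cache = np.array([[0.10, 0.40, 0.70],   # K1 for "John"
                    [0.20, 0.50, 0.30]])  # K2 for "sat"
V_cache = np.array([[0.90, 0.10, 0.20],   # V1 for "John"
                    [0.30, 0.80, 0.60]])  # V2 for "sat"

# Query vector for the new token "then" (values from the figure).
q3 = np.array([0.31, 0.92, 0.56])

d_k = q3.shape[0]
scores = K_cache @ q3 / np.sqrt(d_k)             # one score per cached token
weights = np.exp(scores) / np.exp(scores).sum()  # softmax over the 2-token context
attn_out = weights @ V_cache                     # weighted mix of cached values

print(weights)   # attention of "then" over ["John", "sat"]
print(attn_out)  # context vector used to predict the next token
```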
Autoregressive Generation with KV Cache
• Context length: 2 tokens
• Query token: "then", which attends over the cached context
• KV caching enabled: key and value vectors are computed once and reused (see the loop sketch after this list)
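
A sketch of the generation loop under stated assumptions: `W_q`, `W_k`, and `W_v` are hypothetical stand-ins for the model's learned projections, and `attend`/`step` are illustrative helpers, not a real library API. The point it shows is that each token's key and value are computed exactly once, appended to the cache, and reused at every later step, so only the new query is recomputed.

```python
import numpy as np

d_model = 3
rng = np.random.default_rng(0)

# Stand-ins for the model's learned projection matrices (hypothetical).
W_q = rng.standard_normal((d_model, d_model))
W_k = rng.standard_normal((d_model, d_model))
W_v = rng.standard_normal((d_model, d_model))

def attend(q, K, V):
    """Scaled dot-product attention of one query over the cached keys/values."""
    scores = K @ q / np.sqrt(d_model)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

K_cache, V_cache = [], []

def step(x):
    """Process one new token embedding x, reusing the cache for all prior tokens."""
    # K and V for this token are computed once, then cached for later steps.
    K_cache.append(W_k @ x)
    V_cache.append(W_v @ x)
    q = W_q @ x
    return attend(q, np.stack(K_cache), np.stack(V_cache))

# Prefill with the context tokens ("John", "sat"), then one step for "then".
for emb in [rng.standard_normal(d_model),   # "John"
            rng.standard_normal(d_model),   # "sat"
            rng.standard_normal(d_model)]:  # "then"
    out = step(emb)

print(out)  # attention output for "then" over the 2-token cached context
```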