Multi-Token Generation (Predictions)

Generation begins: the token "then" queries the context to predict the next token.

Autoregressive token prediction (no caching):

[Figure: context tokens "John" and "sat" each have key/value vectors (K1/V1, K2/V2, values elided); the query token "then" produces Q3 = [0.31, 0.92, 0.56] and attends over the context.]
🔄 Autoregressive Generation
• Context length: 2 tokens ("John", "sat")
• Query token: "then" computes attention over the full context
• No caching: key and value vectors are recomputed at every generation step
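The recomputation described above can be sketched as follows. This is a minimal single-head toy example, assuming random stand-in embeddings and hypothetical projection weights (`W_q`, `W_k`, `W_v` are illustrative, not from the original figure):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy embedding / head dimension

# Hypothetical toy projection matrices (illustrative only)
W_q = rng.standard_normal((d, d))
W_k = rng.standard_normal((d, d))
W_v = rng.standard_normal((d, d))

def attend_no_cache(context_embeds, query_embed):
    """One attention step WITHOUT caching: K and V for the entire
    context are recomputed from scratch on every call."""
    K = context_embeds @ W_k          # recomputed each generation step
    V = context_embeds @ W_v          # recomputed each generation step
    q = query_embed @ W_q
    scores = K @ q / np.sqrt(d)       # scaled dot-product scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()          # softmax over the context
    return weights @ V                # attention output for the query token

# Context: "John", "sat"; query: "then" (embeddings are random stand-ins)
context = rng.standard_normal((2, d))
query = rng.standard_normal(d)
out = attend_no_cache(context, query)
print(out.shape)  # (4,)
```

Each generation step calls `attend_no_cache` with a context that is one token longer, so the K/V projections for all earlier tokens are redone every time; a KV cache avoids exactly this repeated work.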