Attention Visualizer

Explore how self-attention works in transformers. Enter a sentence, see how queries match keys, and observe how attention weights determine which tokens influence each output position.
Click a query token to see what it attends to:
Attention(Q, K, V) = softmax(QKᵀ / √dₖ) V
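The formula above can be sketched in a few lines of NumPy. This is a minimal single-head illustration of scaled dot-product attention, not the visualizer's actual implementation; the function name and the tiny random Q, K, V matrices are made up for the example:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(QK^T / sqrt(d_k)) V and return (outputs, weights)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # (n_queries, n_keys) match scores
    # Numerically stable row-wise softmax: each query gets a distribution over keys
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights                # weighted mix of values + attention matrix

# Toy example: 3 tokens, head dimension d_k = 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out, attn = scaled_dot_product_attention(Q, K, V)
```

Each row of `attn` sums to 1 and is exactly what the matrix and bipartite views draw: how strongly the selected query token attends to every key token.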
Matrix
Bipartite
Statistics
Selected Token: -
Max Attention: -
Entropy: -
Scale Factor: -
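The Entropy statistic presumably measures how concentrated the selected token's attention distribution is: near zero when one token dominates, and at its maximum when attention is spread uniformly. A sketch of how such a value could be computed (the function name is illustrative, not the visualizer's API):

```python
import numpy as np

def attention_entropy(weights):
    """Shannon entropy (in bits) of one query's attention distribution.
    Low entropy: attention focused on a few tokens; high: spread out."""
    w = np.asarray(weights, dtype=float)
    w = w[w > 0]                         # treat 0 * log(0) as 0
    return float(-(w * np.log2(w)).sum())

# Uniform attention over 4 tokens yields the maximum: log2(4) = 2 bits
print(attention_entropy([0.25, 0.25, 0.25, 0.25]))  # → 2.0
```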
Key Insight
Select a token to see attention patterns.