Minimal Attention Model Demo (Browser-Only)
This project is a small in-browser demonstration of key components of a transformer-style attention mechanism. It runs entirely in JavaScript using ES modules.
It includes:
• Word embeddings
• Positional encoding
• Scaled dot-product attention
• Softmax scoring
• A simple training loop (cross-entropy loss)
• Prediction of the next token based on input context
No third-party machine learning libraries are used.
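The core of the list above, scaled dot-product attention with softmax scoring, can be sketched in plain JavaScript. This is a minimal illustration, not the actual implementation in real.js; the function names and array shapes here are assumptions.

```javascript
// Numerically stable softmax over an array of scores.
function softmax(xs) {
  const max = Math.max(...xs);                 // subtract max for stability
  const exps = xs.map(x => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((s, ai, i) => s + ai * b[i], 0);
}

// Scaled dot-product attention.
// q, k, v: arrays of d-dimensional vectors (one per token position).
function attention(q, k, v) {
  const d = q[0].length;
  return q.map(qi => {
    // Score each key against this query, scaled by sqrt(d).
    const scores = k.map(ki => dot(qi, ki) / Math.sqrt(d));
    const weights = softmax(scores);
    // Weighted sum of the value vectors.
    return v[0].map((_, j) =>
      weights.reduce((s, w, t) => s + w * v[t][j], 0));
  });
}
```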
Files
| File | Purpose |
|---|---|
| index.html | Basic UI output + script inclusion |
| real.js | Full attention model implementation |
| Vector.js | Basic vector operations |
| Matrix.js | Basic dense matrix operations |
| server.js | Minimal static HTTP server (Node.js) |
Vocabulary
The demo uses a tiny fixed vocabulary:
The, Cat, Sat, On, Mat, Bench, Book, Great, Is
Tokens are mapped to integer indices.
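The token-to-index mapping might be built like this (illustrative; the exact mapping lives in real.js and the index order is an assumption):

```javascript
// Fixed vocabulary from the demo, indexed in declaration order.
const VOCAB = ['The', 'Cat', 'Sat', 'On', 'Mat', 'Bench', 'Book', 'Great', 'Is'];
const TOKEN_TO_ID = Object.fromEntries(VOCAB.map((w, i) => [w, i]));

// Encode a space-separated sentence into integer token indices.
function encode(sentence) {
  return sentence.split(' ').map(w => TOKEN_TO_ID[w]);
}
```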
Training
Training data sequences:
["The Book Is Great"]
["The Cat Sat On The Mat"]
["The Cat Sat On The Bench"]
…
Each epoch loops over all sequences and performs:
- Embedding lookup
- Positional encoding added to embeddings
- Query / Key / Value projections
- Scaled dot-product attention
- Weighted sum → logits → softmax probabilities
- Cross-entropy loss + weight updates on:
  • Output projection matrix
  • Token embeddings
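The positional-encoding step in the loop above is commonly sinusoidal; here is a sketch under that assumption (real.js may use a different scheme):

```javascript
// Sinusoidal positional encoding (a common choice, assumed here).
// Returns a [seqLen][dim] array to be added element-wise to embeddings.
function positionalEncoding(seqLen, dim) {
  const pe = [];
  for (let pos = 0; pos < seqLen; pos++) {
    const row = new Array(dim);
    for (let i = 0; i < dim; i++) {
      // Even indices use sin, odd indices use cos, at matching frequencies.
      const angle = pos / Math.pow(10000, (2 * Math.floor(i / 2)) / dim);
      row[i] = i % 2 === 0 ? Math.sin(angle) : Math.cos(angle);
    }
    pe.push(row);
  }
  return pe;
}
```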
The system prints intermediate progress into DOM elements.
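A single cross-entropy update on the output projection, as performed at the end of each epoch step, might look like this (a sketch; the names `W`, `context`, and `lr`, and the plain SGD rule, are assumptions rather than the real.js code):

```javascript
// Numerically stable softmax over logits.
function softmax(xs) {
  const m = Math.max(...xs);
  const e = xs.map(x => Math.exp(x - m));
  const s = e.reduce((a, b) => a + b, 0);
  return e.map(x => x / s);
}

// One update step on the output projection only.
// context: attended context vector; W: [vocabSize][dim] projection;
// target: index of the true next token; lr: learning rate.
function trainStep(context, W, target, lr = 0.1) {
  const logits = W.map(row =>
    row.reduce((s, w, i) => s + w * context[i], 0));
  const probs = softmax(logits);
  const loss = -Math.log(probs[target]);
  // dL/dlogit_k = probs[k] - 1{k == target}; chain through each W row.
  W.forEach((row, k) => {
    const grad = probs[k] - (k === target ? 1 : 0);
    row.forEach((w, i) => { row[i] = w - lr * grad * context[i]; });
  });
  return loss;
}
```

Repeated calls with the same target drive the loss down, which is the progression the page displays.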
Output
Once trained, the model prints predictions:
Next word after 'The Book Is': ...
Next word after 'The Cat Sat': ...
Next word after 'The Cat': ...
...
Predictions are appended to the .prediction container on the page.
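Appending a prediction line could be done roughly like this (illustrative only; the actual DOM code lives in real.js):

```javascript
// Append one "Next word after '...'" line to the .prediction container.
function showPrediction(context, word) {
  const el = document.querySelector('.prediction');
  const line = document.createElement('div');
  line.textContent = `Next word after '${context}': ${word}`;
  el.appendChild(line);
}
```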
How to Run
1. Start the server
From the folder containing server.js and the HTML/JS files:
node server.js
The server will listen on:
http://localhost:1234
2. Open the demo in a browser
Navigate to:
http://localhost:1234
The demo will:
• Load embeddings
• Run the training loop
• Display loss progression
• Show final predictions
Notes
• This is a simplified demonstration intended for clarity, not accuracy
• No batching, dropout, layer norm, or multi-head attention
• Updates only modify the token embeddings and output projection (the query/key/value projections are not updated)