# Minimal Attention Model Demo (Browser-Only)

This project is a small in-browser demonstration of the key components of a transformer-style attention mechanism. It runs entirely in JavaScript using ES modules.

It includes:

- Word embeddings
- Positional encoding
- Scaled dot-product attention
- Softmax scoring
- A simple training loop (cross-entropy loss)
- Prediction of the next token from the input context

No third-party machine learning libraries are used.

---

## Files

| File         | Purpose                              |
| ------------ | ------------------------------------ |
| `index.html` | Basic UI output + script inclusion   |
| `real.js`    | Full attention model implementation  |
| `Vector.js`  | Basic vector operations              |
| `Matrix.js`  | Basic dense matrix operations        |
| `server.js`  | Minimal static HTTP server (Node.js) |

---

## Vocabulary

The demo uses a tiny fixed vocabulary:

```
The, Cat, Sat, On, Mat, Bench, Book, Great, Is
```

Tokens are mapped to integer indices.

---

## Training

Training data sequences:

```
["The Book Is Great"]
["The Cat Sat On The Mat"]
["The Cat Sat On The Bench"]
…
```

Each epoch loops over all sequences and performs:

1. Embedding lookup
2. Positional encoding added to the embeddings
3. Query / Key / Value projections
4. Scaled dot-product attention
5. Weighted sum → logits → softmax probabilities
6. Cross-entropy loss + weight updates on:
   - the output projection matrix
   - the token embeddings

The system prints intermediate progress into DOM elements. (Hedged code sketches of these steps appear in the appendix at the end of this README.)

---

## Output

Once trained, the model prints predictions:

```
Next word after 'The Book Is': ...
Next word after 'The Cat Sat': ...
Next word after 'The Cat': ...
...
```

Predictions are appended to the `.prediction` container in the page.

---

## How to Run

### 1 — Start the server

From the folder containing `server.js` and the HTML/JS files:

```bash
node server.js
```

The server will listen on:

```
http://localhost:1234
```

### 2 — Open the demo in a browser

Navigate to:

```
http://localhost:1234
```

The demo will:

- Load the embeddings
- Run the training loop
- Display the loss progression
- Show the final predictions

---

## Notes

- This is a simplified demonstration intended for clarity, not accuracy.
- No batching, dropout, layer norm, or multi-head attention.
- The update rules only modify the embeddings and the output projection (queries/keys/values are not updated).
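
---

## Appendix: Code Sketches

The snippets below are minimal, self-contained sketches of the steps described above, not excerpts from `real.js`. They use plain arrays instead of the project's `Vector.js` / `Matrix.js` helpers, and every function and variable name in them is illustrative.

First, sinusoidal positional encoding in its standard transformer formulation (the demo's actual encoding scheme and embedding width `dModel` are assumptions here):

```js
// Sketch: sinusoidal positional encoding.
// pe[pos][i] = sin(pos / 10000^(2k/dModel)) for even i, cos(...) for odd i,
// where k = floor(i / 2). `dModel` is the embedding width (assumed).
function positionalEncoding(seqLen, dModel) {
  const pe = [];
  for (let pos = 0; pos < seqLen; pos++) {
    const row = new Array(dModel);
    for (let i = 0; i < dModel; i++) {
      const angle = pos / Math.pow(10000, (2 * Math.floor(i / 2)) / dModel);
      row[i] = i % 2 === 0 ? Math.sin(angle) : Math.cos(angle);
    }
    pe.push(row);
  }
  return pe;
}

// The encoding is added elementwise to the token embeddings:
// embeddings[pos][i] += pe[pos][i]
```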
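
Next, softmax and scaled dot-product attention. Each query is scored against every key, the scores are scaled by √dk and normalized with softmax, and the output for that query is a weighted sum of the value vectors:

```js
// Numerically stable softmax: subtract the max before exponentiating.
function softmax(xs) {
  const max = Math.max(...xs);
  const exps = xs.map(x => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

function dot(a, b) {
  return a.reduce((s, ai, i) => s + ai * b[i], 0);
}

// Q, K, V are arrays of row vectors, one per token.
function attention(Q, K, V) {
  const dk = K[0].length;
  return Q.map(q => {
    const scores = K.map(k => dot(q, k) / Math.sqrt(dk)); // scaled scores
    const weights = softmax(scores);                      // attention weights
    // weighted sum of the value vectors
    return V[0].map((_, j) =>
      weights.reduce((s, w, t) => s + w * V[t][j], 0)
    );
  });
}
```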
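
The training step combines cross-entropy loss with a plain gradient update on the output projection, reusing the `softmax` helper above. The convenient identity is that for softmax followed by cross-entropy, the gradient of the loss with respect to the logits is simply `probs - onehot(target)`. The names `h`, `W`, `target`, and `lr` are all hypothetical:

```js
// Sketch: one cross-entropy training step on the output projection W.
// h: final attention output vector; W: vocabSize x dModel matrix (rows);
// target: index of the true next token; lr: learning rate (all assumed).
function trainStep(h, W, target, lr) {
  // logits[v] = W[v] · h
  const logits = W.map(row => row.reduce((s, w, i) => s + w * h[i], 0));
  const probs = softmax(logits);
  const loss = -Math.log(probs[target]);

  // dL/dlogits = probs - onehot(target),
  // so dL/dW[v][i] = (probs[v] - 1{v == target}) * h[i]
  for (let v = 0; v < W.length; v++) {
    const grad = probs[v] - (v === target ? 1 : 0);
    for (let i = 0; i < h.length; i++) {
      W[v][i] -= lr * grad * h[i];
    }
  }
  return loss;
}
```

The analogous gradient flows back into the token embeddings as well; per the Notes above, the query/key/value projections are left fixed.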
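
Finally, prediction is just an argmax over the softmax probabilities, mapped back through the vocabulary. The `vocab` array matches the word list above; `predictNext` is an illustrative name:

```js
const vocab = ["The", "Cat", "Sat", "On", "Mat", "Bench", "Book", "Great", "Is"];
const tokenToId = Object.fromEntries(vocab.map((t, i) => [t, i]));

// Pick the highest-probability token and map it back to a word.
function predictNext(probs) {
  let best = 0;
  for (let v = 1; v < probs.length; v++) {
    if (probs[v] > probs[best]) best = v;
  }
  return vocab[best];
}
```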