Use this page as the short navigation map of the course. For the full day-by-day release order, use [[How to Create a LLM from scratch and deploy it]].

```mermaid
flowchart TD
    A["Core concepts"] --> B["Training and evaluation"]
    B --> C["Runtime and deployment"]
    C --> D["Final chatbot workflow"]
    D --> E["Optional advanced track"]
```

## Main Path

Follow the main path in this order:

1. Learn the concept in the note.
2. Open the notebook or smallest code surface.
3. Connect it to the relevant `picollm/accelerated` file.
4. Move to the product and deployment notes once the model path is clear.

## Main Path Milestones

1. Understand the model's input and representation layers.
   Notes: [[Tokenization]], [[Embedding Layer]], [[Positional Encoding]]
2. Understand the main Transformer operators and blocks.
   Notes: [[Scaled Dot-Product Attention]], [[Multi-head Attention]], [[Feed-Forward Network]], [[Layer Normalization]], [[Encoder Block]], [[Decoder Block]], [[Causal Language Modeling]]
3. Understand how the model is trained and measured.
   Notes: [[Training Loop]], [[Training Configuration and Hyperparameters]], [[Evaluation and Model Quality]], [[Experiment Tracking and Run Analysis]]
4. Understand inference, runtime, and system behavior.
   Notes: [[Inference and Sampling]], [[Inference Runtime and KV Cache]], [[Compute, Time, and Cost of LLMs]], [[Distributed Training and Multi-GPU]]
5. Understand data shaping and post-training.
   Notes: [[Data Curation and Dataset Quality]], [[Chat Format and SFT]], [[SFT Flow]]
6. Understand serving, deployment, and product clients.
   Notes: [[FastAPI Chat App]], [[Serving, Latency, and Observability]], [[Deployment]], [[Real Chatbot Workflow]], [[OpenTUI Terminal Chat App]], [[Vercel AI SDK Chat App]], [[picollm Code Map]]

## Optional Advanced Track

Take the advanced track after the main path if you want the research and systems layer:

- [[Scaling Laws and Compute-Optimal Training]]
- [[Optimizer Theory for Transformer Training]]
- [[Advanced Distributed Training Systems]]
- [[Advanced Inference Systems]]
- [[Formal Evaluation and Benchmarking]]
- [[Reproducibility and Research Method]]
- [[Post-Training Beyond SFT]]
- [[Safety and Alignment Evaluation]]
- [[Advanced Data Engineering for LLMs]]
- [[Interpretability and Mechanistic Analysis]]

## Code Layers

The runnable course material appears in three layers:

- [notebooks/](https://github.com/Montekkundan/llm/tree/main/notebooks) for live walkthroughs
- [course_tools/](https://github.com/Montekkundan/llm/tree/main/course_tools) for small concept-first runtimes
- [picollm/accelerated/](https://github.com/Montekkundan/llm/tree/main/picollm/accelerated) for the serious [[Glossary#Tokenizer|tokenizer]], training, eval, and chat path

Useful external comparisons:

- [rasbt/LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch)
- [nanochat](https://github.com/karpathy/nanochat)