# Nanochat Architecture

## What This Concept Is

When you compare this course repo with `nanochat`, the goal is not to copy file layouts mechanically. The goal is to see how two serious LLM codebases break the same broad workflow into understandable layers. This note gives you that architectural map.

## Foundation Terms You Need First

An **architecture map** is a high-level picture of how the main files and subsystems fit together. A **training layer** handles data, model definition, and optimization. A **serving layer** handles inference and user-facing requests. A **comparison repo** is an external codebase used as a reference point rather than a literal template.

So read this note with one question in mind: where does each major idea from the course live in real code?

```mermaid
flowchart TD
    A["Course notes and notebooks"] --> B["course_tools"]
    B --> C["picoLLM accelerated runtime"]
    C --> D["Train / eval / SFT / serve"]
    E["External systems reference"] --> F["nanochat"]
    G["External concepts reference"] --> H["rasbt/LLMs-from-scratch"]
```

> [!info] Course code
> Use the companion repository for runnable notebooks, figures, and implementation references for this lecture:
> - [notebooks/nanochat_architecture/lecture_walkthrough.ipynb](https://github.com/Montekkundan/llm/blob/main/notebooks/nanochat_architecture/lecture_walkthrough.ipynb)
> - [README.md](https://github.com/Montekkundan/llm/blob/main/README.md)
> - [picollm/README.md](https://github.com/Montekkundan/llm/blob/main/picollm/README.md)
> - [picollm/accelerated/README.md](https://github.com/Montekkundan/llm/blob/main/picollm/accelerated/README.md)
> - [picollm/accelerated/speedrun.sh](https://github.com/Montekkundan/llm/blob/main/picollm/accelerated/speedrun.sh)

## Why this comparison matters

If you finish a concepts-first course and then get lost when you open a production-oriented repo, this comparison helps you see that larger systems are still made of recognizable layers.[^1][^2]

In this course repo, those layers appear as:

- `notebooks/`
- `course_tools/`
- `scripts/`
- `picollm/accelerated/`
- `apps/`

In `nanochat`, the names differ, but the architectural logic is similar.

## Reading Rule

The recommended reading order is:

1. learn the idea in the course note
2. inspect the implementation in `picollm`
3. use `rasbt/LLMs-from-scratch` for alternative concept-first explanations
4. use `nanochat` to compare against a more optimized, systems-oriented external repo

So `nanochat` is not the main code path for the course. `picollm/accelerated` is.

## Responsibility mapping you should know

The cleanest way to compare the two repos is by responsibility, not by folder name. In this repo, the responsibilities map to files as follows:

- tokenizer logic: `picollm/accelerated/tokenizer.py`
- model definition: `picollm/accelerated/gpt.py`
- base pretraining: `picollm/accelerated/pretrain/train.py`
- base evaluation: `picollm/accelerated/pretrain/eval.py`
- chat [[Glossary#SFT|SFT]]: `picollm/accelerated/chat/sft.py`
- chat evaluation: `picollm/accelerated/chat/eval.py`
- one-command pipeline: `picollm/accelerated/speedrun.sh`

That is the level at which this repo is now closest to `nanochat`.[^1]

## What To Show Live

Use the notebook to compare:

- the smaller runtime surfaces in this repo
- the accelerated `picollm` path that most closely mirrors the main `nanochat` flow
- the more product-oriented training and serving surfaces in `nanochat`

> [!example] Notebook follow-up
> - [notebooks/nanochat_architecture/lecture_walkthrough.ipynb](https://github.com/Montekkundan/llm/blob/main/notebooks/nanochat_architecture/lecture_walkthrough.ipynb)
> Use the notebook here to walk the repo-responsibility comparison while this architecture map is still in view.

The lesson is not that the repos are identical. The lesson is that once you understand the layers, a larger repo stops looking opaque.

## Where rasbt fits in this comparison

One more orientation point helps alongside `nanochat`.

- `rasbt/LLMs-from-scratch` is the cleanest concept-first external reference.
- `picollm` is the course’s main implementation path.
- `nanochat` is the systems-first comparison reference.

Keep that three-way map explicit so you do not confuse “reference repo” with “the repo I am expected to build in this course.”

## Relationship to the rest of the course

Read this note after:

- [[picollm Code Map]]
- [[Real Chatbot Workflow]]

At that point, you already understand the local code and can use the comparison to orient yourself in an external reference repo.

## Key takeaway

`nanochat` is useful because it shows how the same LLM ideas appear in a more optimized, production-oriented architecture. `picollm/accelerated` is where you should look for the version used in this course repo.[^1][^2]

> [!example] Notebook walkthroughs in this lecture
> Use these companion notebook links as you read or review this lecture:
>
> - [notebooks/nanochat_architecture/lecture_walkthrough.ipynb](https://github.com/Montekkundan/llm/blob/main/notebooks/nanochat_architecture/lecture_walkthrough.ipynb)

<div style="display:flex; gap:1rem; margin:1.5rem 0; flex-wrap:wrap;">
  <div style="flex:1; min-width:220px; border:1px solid var(--background-modifier-border); border-radius:12px; padding:1rem; background:var(--background-secondary);">
    <div style="font-size:0.85em; color:var(--text-muted); margin-bottom:0.35rem;">Previous</div>
    <div><a class="internal-link" data-href="picollm Code Map" href="picollm%20Code%20Map">picollm Code Map</a></div>
  </div>
  <div style="flex:1; min-width:220px; border:1px solid var(--background-modifier-border); border-radius:12px; padding:1rem; background:var(--background-secondary);">
    <div style="font-size:0.85em; color:var(--text-muted); margin-bottom:0.35rem;">Next</div>
    <div><a class="internal-link" data-href="Vercel AI SDK Chat App" href="Vercel%20AI%20SDK%20Chat%20App">Vercel AI SDK Chat App</a></div>
  </div>
</div>

## Further reading

- Andrej Karpathy, "minbpe," 2025. https://github.com/karpathy/minbpe
- Hugging Face, "Transformers documentation," 2025. https://huggingface.co/docs/transformers/index

## References

[^1]: Andrej Karpathy, [nanochat](https://github.com/karpathy/nanochat)
[^2]: Sebastian Raschka, [LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch)