JIT map: pull when stuck, not before
No study weeks. This is the index you hit after the wall, organized by the wall. Timebox any dive to 90 minutes, then get back to the ship. (Descendant of this repo's retired 13-track curriculum — the residue that survived contact with the 90-day plan.)
Directing AI to build (spec-driven dev & agentic loops)
- GitHub Spec Kit — spec-driven development made concrete: the spec is the source of truth, the code is regenerated. Steal the workflow.
- Anthropic — Claude Code best practices — explore/plan/code/commit and the test-driven agentic loop; the practical version of Week 0's two methods.
Prompting & context engineering
- Anthropic prompt engineering docs — the practical canon: system prompts, examples, chain-of-thought, long-context tips. Read the overview, then the page for your symptom.
- Anthropic — effective context engineering for agents — the successor discipline to prompt-wording: what to put in the window, what to cut, compaction, and why context is the real design surface.
- Prompting Guide — broad, provider-neutral reference with research citations; good second opinion.
Structured outputs & tool calling
- Claude tool use — schemas, forcing tool choice, parallel calls.
- Instructor — Pydantic validation + retries around any provider.
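
The validate-and-retry idea behind the Instructor entry can be sketched without any provider. This is a minimal stdlib-only illustration, not Instructor's API: the `Ticket` schema, `parse_ticket`, `extract_with_retries`, and the canned fake model are all hypothetical, and a hand-written check stands in for Pydantic validation.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Ticket:
    # Hypothetical target schema for the model's JSON output.
    title: str
    priority: int


def parse_ticket(raw: str) -> Ticket:
    """Parse and validate raw model output; raise ValueError on any problem."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    title, priority = data.get("title"), data.get("priority")
    if not isinstance(title, str) or not title:
        raise ValueError("title must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(priority, bool) or not isinstance(priority, int):
        raise ValueError("priority must be an integer")
    if not 1 <= priority <= 5:
        raise ValueError("priority must be between 1 and 5")
    return Ticket(title=title, priority=priority)


def extract_with_retries(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Ticket:
    """Call the model, validate the reply, and feed any error into the next attempt."""
    message = prompt
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(message)
        try:
            return parse_ticket(raw)
        except ValueError as exc:
            last_error = exc
            message = f"{prompt}\n\nYour last reply was rejected: {exc}. Reply with JSON only."
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")


def make_fake_model(replies: Iterable[str]) -> Callable[[str], str]:
    # Hypothetical stand-in for a provider call: returns canned replies in order.
    remaining = iter(replies)
    return lambda _message: next(remaining)


# First reply fails validation, second one passes.
fake = make_fake_model(["priority high", '{"title": "Fix login", "priority": 2}'])
ticket = extract_with_retries(fake, "Extract a ticket as JSON.")
```

The design choice worth copying is that the validation error goes back into the next prompt, so a retry is a correction and not a blind resample.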
Evals (the moat)
- Hamel Husain — evals — the one to reread quarterly: error analysis, judges, looking at your data.
- Applied LLMs — a year of production lessons; evaluation and monitoring sections especially.
RAG & retrieval
- sentence-transformers — embeddings + rerankers, one library.
- Anthropic — contextual retrieval — chunk-context and hybrid+rerank, with benchmark deltas.
- Ragas — RAG metric definitions worth stealing even without the framework.
- Eugene Yan — LLM patterns — honest synthesis of retrieval/eval/guardrail patterns.
Agents & MCP
- Building effective agents — workflows vs agents; start simpler than you want to.
- Writing tools for agents — tool descriptions and output shaping; the highest-ROI agent read.
- MCP docs — the protocol and per-language server quickstarts.
- Lilian Weng — agents — the conceptual map (planning, memory, tools).
- Anthropic — multi-agent research system — when one agent isn't enough: orchestrator-worker, and the coordination cost that comes with it. Read before the multi-agent stretch.
Fine-tuning & open models
- Unsloth docs — free-tier-GPU QLoRA, working notebooks per model family.
- HF LLM course — the ecosystem end-to-end; dip by chapter, don't binge.
- HF PEFT — LoRA concepts + knobs.
- Karpathy — training recipe — debugging discipline for anything with a loss curve.
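
For the "LoRA concepts + knobs" entry, the main knob is the rank. A rough sketch of the arithmetic, assuming the standard LoRA setup where a frozen `d_out x d_in` weight gets two small trainable matrices of shapes `rank x d_in` and `d_out x rank`. The 4096-wide layer and rank 16 are illustrative numbers, not taken from any specific model.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds beside one frozen d_out x d_in weight."""
    # A is (rank x d_in), B is (d_out x rank); the frozen weight is not counted.
    return rank * (d_in + d_out)


# Hypothetical square projection layer, chosen only to make the ratio easy to read.
d = 4096
adapter_params = lora_trainable_params(d, d, rank=16)
full_params = d * d
fraction = adapter_params / full_params

print(adapter_params, full_params, fraction)  # 131072 16777216 0.0078125
```

Doubling the rank doubles the adapter size, which is why rank is the first knob to reach for when an adapter underfits.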
Local & serving
- Ollama docs — local models behind an API in minutes.
- vLLM docs — when you outgrow Ollama: real serving, batching, throughput. (Also the best place to understand KV cache and why long context costs what it costs.)
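
The "why long context costs what it costs" point comes down to simple arithmetic: every cached token stores one key and one value vector per layer per KV head. A back-of-the-envelope sketch, assuming a plain transformer decoder with an unquantized cache. The model dimensions below are hypothetical, and the helper `kv_cache_bytes` is not a vLLM function.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,  # 2 bytes assumes fp16/bf16 cache entries
) -> int:
    """Approximate KV cache size: keys plus values, for every layer and cached token."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_value


# Hypothetical config: 32 layers, 8 KV heads of width 128, 16-bit cache.
per_token = kv_cache_bytes(32, 8, 128, seq_len=1)
at_8k = kv_cache_bytes(32, 8, 128, seq_len=8192)

print(per_token)         # 131072 bytes, i.e. 128 KiB per token
print(at_8k / 2**30)     # 1.0 GiB for a single 8192-token sequence
```

The cost is linear in both sequence length and batch size, so serving many long conversations at once is bounded by cache memory well before it is bounded by the weights. The estimate ignores allocator overhead and any cache quantization.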
Multimodal
- Whisper / faster-whisper — open STT.
- Claude vision — image inputs, limits, image token costs.
- HF tasks — the map of open models per modality, with runnable examples.
Safety & security (know the failure modes you ship)
- Simon Willison — the lethal trifecta — untrusted input + private data + an outbound channel = an agent that can be turned against its owner. The one to internalize before wiring any tool.
- Simon Willison — prompt injection series — the clearest running account of the unsolved problem every agent inherits; read before giving any agent write-access to anything.
- OWASP Top 10 for LLM applications — the checklist to walk before each launch, especially the big swing.
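
The lethal trifecta lends itself to a mechanical pre-wiring check: label each tool by what it touches, then flag any agent whose tool set covers all three legs. A minimal sketch, assuming the capability flags are assigned by hand during review. The `Tool` class, the flag names, and the example tools are hypothetical, and this is a review aid, not a defense against prompt injection.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Tool:
    # Capability flags are hand-assigned labels, not detected automatically.
    name: str
    reads_untrusted_input: bool = False
    reads_private_data: bool = False
    can_send_outbound: bool = False


def has_lethal_trifecta(tools: Iterable[Tool]) -> bool:
    """True when the combined tool set covers all three legs of the trifecta."""
    tools = list(tools)
    return (
        any(t.reads_untrusted_input for t in tools)
        and any(t.reads_private_data for t in tools)
        and any(t.can_send_outbound for t in tools)
    )


# Hypothetical agent: each tool looks harmless alone; the combination is the problem.
agent_tools = [
    Tool("read_issue_comments", reads_untrusted_input=True),
    Tool("read_private_repo", reads_private_data=True),
    Tool("post_webhook", can_send_outbound=True),
]

print(has_lethal_trifecta(agent_tools))      # True
print(has_lethal_trifecta(agent_tools[:2]))  # False: no outbound channel
```

Removing any one leg clears the check, which matches the practical advice: cut a capability when the full combination is not needed.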
Production & LLMOps (turning a demo into a service)
- Chip Huyen — building LLM applications for production — the systems view: latency, cost, monitoring, the gap between a notebook and a service.
- Langfuse docs — tracing, datasets, scores, and production monitoring; generous free tier.
Staying at the frontier (the daily 15 minutes)
Release notes and model cards over courses, always: Anthropic news · OpenAI news · HF blog · Simon Willison's weblog — the last one is the field's best running changelog, and the model for the write-ups you're publishing.