Project · LLMOps
LLMOps Curriculum
A structured path through the operational side of LLM systems — observability, eval discipline, cost routing, model gateways. Each phase ends at a working commit, not a quiz.
Certifications
-
Claude Certified Architect Foundations
Agentic architecture, MCP tool design, Claude Code workflows, context management. ~70% of exam surface overlaps existing daily practice.
-
AWS AI Practitioner (AIF-C01)
Foundational GenAI-on-AWS vocabulary. Pre-SAA warmup that fills the Bedrock/SageMaker gap SAA skims over.
-
AWS Solutions Architect Associate (SAA-C03)
Locked legibility cert. Standalone study (Cantrill + Tutorials Dojo); curriculum Phase 7 deploys provide hands-on.
41 tasks · 13 with commit evidence
- Phase 0 You've already done more than you think pending2 tasks
- Reframe existing work in LLMOps vocabulary
- Flag the one gap that feels embarrassing and write why
-
- Phase 1 The 4-layer mental model pending3 tasks
- Read [Applied LLMs sections 1 + 3](https://applied-llms.org/)
- Skim [ZenML's 457 case study roundup](https://www.zenml.io/blog/llmops-in-production-457-case-studies-of-what-actually-works)
- Build the 4-layer tool-slotting table
-
- Phase 2 The retrofit in progress4/10 tasks with evidence
- Stand up self-hosted Langfuse
- Install the three deps in reasonable-ux venv
- Find and wrap Claude-call entry points through LiteLLM
- Add Langfuse instrumentation
- Wrap scoring outputs in Instructor
- Document the stack in reasonable-ux CLAUDE.md
- Run /session-review before commit
- Copy the three-dep install + LiteLLM wrap from reasonable-ux
- Point Langfuse at a second project in the same self-hosted instance
- Instructor for the multi-model scoring output
-
- Phase 3 Eval discipline in progress3/5 tasks with evidence
- Read [Hamel Husain — Your AI Product Needs Evals](https://hamel.dev/blog/posts/evals/)
- Assemble a 20-example golden set for reasonable-ux
- Write Level-1 deterministic assertions as pytest
- Wire eval run into reasonable-ux CI (or local pre-push)
- Optional — [Jason Liu 6 RAG evals](https://jxnl.co/writing/2025/05/19/there-are-only-6-rag-evals/)
-
- Phase 4 Routing + cost strategy in progress1/7 tasks with evidence
- Read [Anthropic prompt caching docs](https://platform.claude.com/docs/en/build-with-claude/prompt-caching)
- Scan [LiteLLM router docs](https://docs.litellm.ai/docs/routing)
- Configure a fallback chain in reasonable-ux
- Enable split cache-token telemetry
- Create a LiteLLM virtual key with a hard budget
- Run the Haiku-first/Sonnet-escalate experiment
- Capture the win (or loss) in a vault note
-
- Phase 5 LangSmith at work / Langfuse at home (ongoing, not sequential) pending3 tasks
- Scaffold the comparison note on day one of LangSmith onboarding at work
- Fill one row per week as work surfaces each feature
- Hard-boundary check every session
-
- Phase 6 Public artifacts (the marketability layer) pending5 tasks
- Gate — 2 weeks of real Langfuse data on reasonable-ux
- Pull the three hero numbers from Langfuse
- Draft writeup #1 — "A solo dev's minimum-viable LLMOps stack"
- Publish to portfolio-site
- Share on X under @reasonequals
-
- Phase 7 Adjacent platform engineering (parallel track) deferred6 tasks
- Take AWS AI Practitioner (AIF-C01) as the warmup
- VPC / subnets / security groups
- IAM + secrets
- Serverless deployment — Lambda + API Gateway
- Container deployment — ECS Fargate
- AWS SAA (SAA-C03)
-