The Course-as-Toolkit Format

A reusable spec for any 'X from scratch' curriculum, modeled on AI Engineering from Scratch

10 minApr 27, 2026

A spec for building Claude-native learning repos that produce a toolkit instead of a certificate. The shape is borrowed from rohitg00/ai-engineering-from-scratch and stripped down to the parts that survive when you swap the subject matter — RAG, data analysis, distributed systems, financial modeling, anything you can teach by building.

This article does not teach a topic. It describes the format so you can pour your own books and resources into it.

The one bet the format makes

Every lesson ships a reusable artifact — a prompt, a skill, an agent, or an MCP server. The course's deliverable is the cumulative outputs/ directory, not a final exam. If your version of this format breaks that bet, you're building a different thing.

Two consequences:

Lessons end with a thing you can install, not "you have learned X."
The toolkit grows monotonically. Phase 18 builds on artifacts shipped in Phase 2.

Anything that doesn't serve those two consequences is incidental and can be changed.

Top-level repo layout

<repo-name>/
├── .claude/
│   └── skills/
│       ├── find-your-level/SKILL.md
│       └── check-understanding/SKILL.md
├── phases/
│   └── XX-phase-name/
│       └── NN-lesson-name/        # see "The lesson unit" below
├── outputs/                       # aggregated toolkit
│   ├── prompts/
│   ├── skills/
│   ├── agents/
│   └── mcp-servers/
├── glossary/
│   └── terms.md
├── scripts/                       # automation, doc generators
├── site/  web/                    # optional public-facing surface
├── LESSON_TEMPLATE.md             # the authoritative spec
├── ROADMAP.md                     # phase list with status indicators
├── CONTRIBUTING.md
├── FORKING.md
├── LICENSE                        # MIT, so the format itself can be forked
└── README.md

Two of these folders carry the Claude integration:

.claude/skills/ — small and stable. Two skills that drive the course itself, not the subject matter.
outputs/ — large and growing. Every lesson contributes here. By the end the learner has a personal, installable toolkit.

The split matters: course skills are the scaffolding; learner artifacts are the deliverable.

The lesson unit

The unit of work is a lesson directory. Same shape every time:

phases/XX-phase-name/NN-lesson-name/
├── code/           # main.py, main.ts, main.rs, main.jl — runnable, comment-free
├── notebook/
│   └── lesson.ipynb
├── docs/
│   └── en.md       # canonical explanation; translations sit alongside
└── outputs/        # prompt-*.md, skill-*.md, agent-*.md, mcp-server-*.md

XX and NN are zero-padded. Phase folder names use kebab-case. Lesson folder names too. No spaces, no underscores.

The 6-step lesson lifecycle

Every docs/en.md follows the same six headings, in the same order:

Step	Purpose	What it must contain
Motto	One-line core idea	A sentence the learner could quote a year later
Problem	Why not knowing this hurts	A concrete scenario, not an abstract motivation
Concept	Mental model + diagram	Mermaid or hand-drawn; no code yet
Build It	From-scratch implementation	Pure language primitives. No frameworks.
Use It	Same thing with a real library	Map every from-scratch piece to its framework counterpart
Ship It	The artifact this lesson produces	Path to the file under `outputs/`

The Build-It / Use-It split is where the format's pedagogy lives. Build-It gives the learner the right to use the framework; Use-It gives them the productivity. Skipping Build-It produces framework users; skipping Use-It produces from-scratch hobbyists. Keep both.

The two built-in skills

These are the only skills the repo ships. They're not subject-matter skills — those live under outputs/skills/ and are produced by the learner.

`/find-your-level` — placement

A short quiz (the original uses 10 questions) that maps the learner's prior knowledge to a starting phase, then prints a personalized path with hour estimates.

SKILL.md should:

Define the question pool. Each question tags the phase(s) it gates.
Define a scoring rule that returns a phase number.
Print: starting phase, recommended phases to skip, estimated hours.

`/check-understanding <phase>` — per-phase checkpoint

A shorter quiz (8 questions in the original) for one phase. Returns a pass/fail plus a list of specific lessons to review on miss.

SKILL.md should:

Take a phase number as argument.
Pull questions from a per-phase pool (one file per phase under the skill's data directory).
On miss, cite the lesson path that teaches the missed concept.

These two skills replace what would otherwise be a placement test buried in a README and a self-assessment buried in a footer. Skills move both to where the learner already is — inside Claude Code.

The outputs/ namespace

Filenames declare install target. Pick the right prefix and the artifact is self-routing.

Prefix	Install target	Example
`prompt-*.md`	Paste into any chat assistant	`prompt-debugger.md`
`skill-*.md`	Drop into `.claude/skills/` (or Cursor / Codex equivalent)	`skill-probability-reasoning.md`
`agent-*.md`	Long-running autonomous worker	`agent-doc-summarizer.md`
`mcp-server-*.{py,ts}`	MCP-compatible host	`mcp-server-vector-search.py`

The repo-level outputs/<type>/ directories are populated by symlinks or copies from the per-lesson outputs/. The aggregator script lives in scripts/.

Authoring conventions (the rules that make forks survive)

Five rules you don't customize:

Code is comment-free. Inline comments rot. Explanation belongs in docs/en.md. This is enforced at review.
docs/en.md is canonical. Translations sit beside it (docs/ko.md, docs/es.md); en.md is the source of truth when they diverge.
LESSON_TEMPLATE.md is the spec. New lessons copy it verbatim and fill the six headings. Don't paraphrase the headings.
ROADMAP.md uses three status markers. ✅ complete, 🚧 in progress, ⬚ planned. Anything else fragments the dashboard.
MIT-licensed with FORKING.md. The structural pattern is the deliverable; permissive licensing makes that real.

These five are the actual reusable kernel. Drop any of them and the format starts collapsing into a generic course.

Mapping source material into the format

You said you have books and resources. Here is how they map.

A book typically becomes one phase or a small group of phases. Chapters do not map one-to-one to lessons — chapters are too long and uneven. Instead:

Read the chapter and extract the concrete things you'd want to be able to do afterwards. Each one is a candidate lesson.
Filter: keep only the ones where Build-It is meaningful. If the only honest "Build It" is pip install x, the concept doesn't deserve a lesson — it deserves a paragraph in another lesson's docs.
The book becomes a citation in docs/en.md ("Concept" section) and never appears as a substitute for the from-scratch implementation.

A paper becomes either one Build-It (reimplement the central algorithm) or one Concept (cite + explain) inside a lesson, never a phase.

A video / blog post becomes a side note in docs/en.md — useful when the learner gets stuck on intuition, never load-bearing.

Glossary entries are extracted as you go: every term that appears in two or more lessons' docs/en.md graduates to glossary/terms.md.

Minimum viable seed

You don't need 20 phases on day one. The smallest version of this format that still works:

<repo>/
├── .claude/skills/find-your-level/SKILL.md   # even if it just returns "Phase 1"
├── phases/01-<topic>/01-<first-lesson>/      # one lesson, six headings, one artifact
│   ├── code/
│   ├── docs/en.md
│   └── outputs/prompt-<name>.md
├── outputs/prompts/                          # symlink/copy from above
├── LESSON_TEMPLATE.md
├── ROADMAP.md                                # one line: ⬚ Phase 1
└── README.md

Ship that. Then the next lesson. Then /check-understanding once Phase 1 has 3+ lessons. Then Phase 2. Skills get refined as the question bank grows.

The trap is treating the 20-phase original as a prerequisite. It isn't — it's an end state. The format works at one phase, one lesson, one artifact.

What you'll customize per repo

The phases list and their order
The languages used in code/ (the original picks four; one is fine)
The exact question banks for the two skills
The glossary
The toolkit's domain — prompts about RAG vs prompts about distributed systems vs prompts about board game design

What you won't customize: the per-lesson layout, the 6-step lifecycle, the two skills' purpose, the outputs/ naming convention, the five authoring rules.

If you're considering customizing one of those, you're considering a different format. That's fine — but name it differently so future readers know which spec they're inheriting from.

The one bet the format makes

Two consequences:

Lessons end with a thing you can install, not "you have learned X."
The toolkit grows monotonically. Phase 18 builds on artifacts shipped in Phase 2.

Anything that doesn't serve those two consequences is incidental and can be changed.

Top-level repo layout

<repo-name>/
├── .claude/
│   └── skills/
│       ├── find-your-level/SKILL.md
│       └── check-understanding/SKILL.md
├── phases/
│   └── XX-phase-name/
│       └── NN-lesson-name/        # see "The lesson unit" below
├── outputs/                       # aggregated toolkit
│   ├── prompts/
│   ├── skills/
│   ├── agents/
│   └── mcp-servers/
├── glossary/
│   └── terms.md
├── scripts/                       # automation, doc generators
├── site/  web/                    # optional public-facing surface
├── LESSON_TEMPLATE.md             # the authoritative spec
├── ROADMAP.md                     # phase list with status indicators
├── CONTRIBUTING.md
├── FORKING.md
├── LICENSE                        # MIT, so the format itself can be forked
└── README.md

Two of these folders carry the Claude integration:

.claude/skills/ — small and stable. Two skills that drive the course itself, not the subject matter.
outputs/ — large and growing. Every lesson contributes here. By the end the learner has a personal, installable toolkit.

The split matters: course skills are the scaffolding; learner artifacts are the deliverable.

The lesson unit

The unit of work is a lesson directory. Same shape every time:

phases/XX-phase-name/NN-lesson-name/
├── code/           # main.py, main.ts, main.rs, main.jl — runnable, comment-free
├── notebook/
│   └── lesson.ipynb
├── docs/
│   └── en.md       # canonical explanation; translations sit alongside
└── outputs/        # prompt-*.md, skill-*.md, agent-*.md, mcp-server-*.md

XX and NN are zero-padded. Phase folder names use kebab-case. Lesson folder names too. No spaces, no underscores.

The 6-step lesson lifecycle

Every docs/en.md follows the same six headings, in the same order:

Step	Purpose	What it must contain
Motto	One-line core idea	A sentence the learner could quote a year later
Problem	Why not knowing this hurts	A concrete scenario, not an abstract motivation
Concept	Mental model + diagram	Mermaid or hand-drawn; no code yet
Build It	From-scratch implementation	Pure language primitives. No frameworks.
Use It	Same thing with a real library	Map every from-scratch piece to its framework counterpart
Ship It	The artifact this lesson produces	Path to the file under `outputs/`

The two built-in skills

These are the only skills the repo ships. They're not subject-matter skills — those live under outputs/skills/ and are produced by the learner.

`/find-your-level` — placement

A short quiz (the original uses 10 questions) that maps the learner's prior knowledge to a starting phase, then prints a personalized path with hour estimates.

SKILL.md should:

Define the question pool. Each question tags the phase(s) it gates.
Define a scoring rule that returns a phase number.
Print: starting phase, recommended phases to skip, estimated hours.

`/check-understanding <phase>` — per-phase checkpoint

A shorter quiz (8 questions in the original) for one phase. Returns a pass/fail plus a list of specific lessons to review on miss.

SKILL.md should:

Take a phase number as argument.
Pull questions from a per-phase pool (one file per phase under the skill's data directory).
On miss, cite the lesson path that teaches the missed concept.

These two skills replace what would otherwise be a placement test buried in a README and a self-assessment buried in a footer. Skills move both to where the learner already is — inside Claude Code.

The outputs/ namespace

Filenames declare install target. Pick the right prefix and the artifact is self-routing.

Prefix	Install target	Example
`prompt-*.md`	Paste into any chat assistant	`prompt-debugger.md`
`skill-*.md`	Drop into `.claude/skills/` (or Cursor / Codex equivalent)	`skill-probability-reasoning.md`
`agent-*.md`	Long-running autonomous worker	`agent-doc-summarizer.md`
`mcp-server-*.{py,ts}`	MCP-compatible host	`mcp-server-vector-search.py`

The repo-level outputs/<type>/ directories are populated by symlinks or copies from the per-lesson outputs/. The aggregator script lives in scripts/.

Authoring conventions (the rules that make forks survive)

Five rules you don't customize:

Code is comment-free. Inline comments rot. Explanation belongs in docs/en.md. This is enforced at review.
docs/en.md is canonical. Translations sit beside it (docs/ko.md, docs/es.md); en.md is the source of truth when they diverge.
LESSON_TEMPLATE.md is the spec. New lessons copy it verbatim and fill the six headings. Don't paraphrase the headings.
ROADMAP.md uses three status markers. ✅ complete, 🚧 in progress, ⬚ planned. Anything else fragments the dashboard.
MIT-licensed with FORKING.md. The structural pattern is the deliverable; permissive licensing makes that real.

These five are the actual reusable kernel. Drop any of them and the format starts collapsing into a generic course.

Mapping source material into the format

You said you have books and resources. Here is how they map.

A book typically becomes one phase or a small group of phases. Chapters do not map one-to-one to lessons — chapters are too long and uneven. Instead:

Read the chapter and extract the concrete things you'd want to be able to do afterwards. Each one is a candidate lesson.
Filter: keep only the ones where Build-It is meaningful. If the only honest "Build It" is pip install x, the concept doesn't deserve a lesson — it deserves a paragraph in another lesson's docs.
The book becomes a citation in docs/en.md ("Concept" section) and never appears as a substitute for the from-scratch implementation.

A paper becomes either one Build-It (reimplement the central algorithm) or one Concept (cite + explain) inside a lesson, never a phase.

A video / blog post becomes a side note in docs/en.md — useful when the learner gets stuck on intuition, never load-bearing.

Glossary entries are extracted as you go: every term that appears in two or more lessons' docs/en.md graduates to glossary/terms.md.

Minimum viable seed

You don't need 20 phases on day one. The smallest version of this format that still works:

<repo>/
├── .claude/skills/find-your-level/SKILL.md   # even if it just returns "Phase 1"
├── phases/01-<topic>/01-<first-lesson>/      # one lesson, six headings, one artifact
│   ├── code/
│   ├── docs/en.md
│   └── outputs/prompt-<name>.md
├── outputs/prompts/                          # symlink/copy from above
├── LESSON_TEMPLATE.md
├── ROADMAP.md                                # one line: ⬚ Phase 1
└── README.md

Ship that. Then the next lesson. Then /check-understanding once Phase 1 has 3+ lessons. Then Phase 2. Skills get refined as the question bank grows.

The trap is treating the 20-phase original as a prerequisite. It isn't — it's an end state. The format works at one phase, one lesson, one artifact.

What you'll customize per repo

The phases list and their order
The languages used in code/ (the original picks four; one is fine)
The exact question banks for the two skills
The glossary
The toolkit's domain — prompts about RAG vs prompts about distributed systems vs prompts about board game design

What you won't customize: the per-lesson layout, the 6-step lifecycle, the two skills' purpose, the outputs/ naming convention, the five authoring rules.

If you're considering customizing one of those, you're considering a different format. That's fine — but name it differently so future readers know which spec they're inheriting from.

The Course-as-Toolkit Format

The one bet the format makes

Top-level repo layout

The lesson unit

The 6-step lesson lifecycle

The two built-in skills

`/find-your-level` — placement

`/check-understanding <phase>` — per-phase checkpoint

The outputs/ namespace

Authoring conventions (the rules that make forks survive)

Mapping source material into the format

Minimum viable seed

What you'll customize per repo

See also

The one bet the format makes

Top-level repo layout

The lesson unit

The 6-step lesson lifecycle

The two built-in skills

`/find-your-level` — placement

`/check-understanding <phase>` — per-phase checkpoint

The outputs/ namespace

Authoring conventions (the rules that make forks survive)

Mapping source material into the format

Minimum viable seed

What you'll customize per repo

See also

The one bet the format makes

Top-level repo layout

The lesson unit

The 6-step lesson lifecycle

The two built-in skills

/find-your-level — placement

/check-understanding <phase> — per-phase checkpoint

The outputs/ namespace

Authoring conventions (the rules that make forks survive)

Mapping source material into the format

Minimum viable seed

What you'll customize per repo

See also

The one bet the format makes

Top-level repo layout

The lesson unit

The 6-step lesson lifecycle

The two built-in skills

/find-your-level — placement

/check-understanding <phase> — per-phase checkpoint

The outputs/ namespace

Authoring conventions (the rules that make forks survive)

Mapping source material into the format

Minimum viable seed

What you'll customize per repo

See also

`/find-your-level` — placement

`/check-understanding <phase>` — per-phase checkpoint

`/find-your-level` — placement

`/check-understanding <phase>` — per-phase checkpoint