
The Skill Building Habit: Teaching Your AI to Remember
After months of daily AI-assisted development, I stopped asking for the same things over and over. I built commands instead.
A "skill" is a reusable workflow your AI agent can invoke. Think of it as a macro, but smarter — it knows your codebase patterns and can adapt.
Here's how to discover and build them.
The Problem: Repetition Without Memory
After months of AI use, I noticed patterns:
- Repetition: I kept asking for the same things — "run tests," "open this file," "review this PR"
- Context loss: Each time, I re-explained what "review" means in my codebase
- Inconsistency: Sometimes the AI ran `pytest`, sometimes `pytest --cov`, sometimes with the wrong flags
Every session started from zero. The AI doesn't remember that we've done this exact thing 50 times before.
Skills solve all three. Define once, invoke consistently, context preserved.
What a Skill Looks Like
Simple examples:
Trigger: "/review"
Action: Full PR review covering correctness, security, performance, testing
Trigger: "/checkpoint"
Action: Save current session state to history file for later branching
Skills sit between "ask the AI to do something from scratch" and "write a bash script." They encode intent that the agent understands and can execute consistently.
Anatomy of a Good Skill
Every skill needs four components:
| Component | Purpose | Example |
|---|---|---|
| Trigger | How to invoke it | pych <file>, /review, /checkpoint |
| Scope | What it operates on | Single file, current branch, whole repo |
| Behavior | What it does | Open file, run commands, generate output |
| Output | What you get back | File opened, test results, markdown summary |
Bad skill: "Do code review stuff"
Good skill: "/review — Full PR review: correctness, security, performance, edge cases, testing. Output structured table with severity levels."
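The four components above can be modeled as a small data structure. This is a hypothetical sketch, not any particular tool's API; the `Skill` class and `describe` helper are illustrative names:

```python
from dataclasses import dataclass

@dataclass
class Skill:
    """One reusable workflow: the four components from the table above."""
    trigger: str   # how to invoke it, e.g. "/review"
    scope: str     # what it operates on
    behavior: str  # what the agent should do
    output: str    # what you get back

review = Skill(
    trigger="/review",
    scope="current branch diff",
    behavior=("Full PR review: correctness, security, performance, "
              "edge cases, testing."),
    output="structured table with severity levels",
)

def describe(skill: Skill) -> str:
    """Render a one-line summary, e.g. for a skill index the agent reads."""
    return f"{skill.trigger}: {skill.behavior} Output: {skill.output}"
```

The point of writing it down this explicitly is that a vague skill ("do code review stuff") fails the type check in your head: it has no scope, no defined behavior, no defined output.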
The Five Categories
After iteration, my skills fall into five categories:
1. Code Review Skills
Different review depths for different contexts:
| Type | When to Use | What It Catches |
|---|---|---|
| Standard review | Daily PRs | Correctness, security basics, performance |
| Strict review | Critical paths, auth code | Zero tolerance for type suppressions, empty catches |
| Architecture review | Cross-system changes | Impact across repositories, contract breaks |
| Frontend review | UI changes | React patterns, accessibility, CSS issues |
| Persona reviews | Pre-merge sanity check | "What would [specific perspective] catch?" |
The persona reviews were a surprise hit. Different reviewers catch different things — one perspective obsesses over naming, another over edge cases. Encoding these perspectives means catching more issues before pushing.
2. Planning & Scoping Skills
For estimating and breaking down work:
| Type | Purpose |
|---|---|
| Scope estimator | How big is this really? |
| Risk identifier | What could blow up unexpectedly? |
| Feasibility check | Can we do this in the time we have? |
These are most valuable for ambiguous requests. "Add caching" sounds simple until the skill identifies multiple layers that might be involved.
3. Codebase Understanding Skills
For navigating and understanding unfamiliar code:
| Type | Purpose |
|---|---|
| Flow tracer | Follow data through the system |
| Area explainer | What does this module do? |
| Pattern matcher | Find similar implementations elsewhere |
These save hours when diving into unfamiliar areas. Instead of grep + read + grep + read, one command gives me the full picture.
4. Repository-Specific Skills
Patterns unique to each codebase:
| Type | Purpose |
|---|---|
| Endpoint scaffolder | Create new API endpoint following local conventions |
| Migration generator | Database schema changes with proper patterns |
| Test scaffolder | New test file matching existing test structure |
| Boilerplate reducer | Standard setup for new modules/packages |
These encode tribal knowledge. "How do we add endpoints here?" becomes a one-word command that produces code matching existing patterns exactly.
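A test scaffolder is the easiest of these to sketch. Everything below is a stand-in for your repo's actual conventions — the `tests/test_<module>.py` layout, the pytest skeleton, and the `scaffold_test` name are all hypothetical assumptions:

```python
from pathlib import Path

# Hypothetical convention: tests live in tests/, named test_<module>.py,
# and start from the same pytest skeleton. Swap in your repo's patterns.
TEMPLATE = '''"""Tests for {module}."""
import pytest

from {package}.{module} import *  # noqa: F403


class Test{cls}:
    def test_placeholder(self):
        pytest.skip("TODO: write real tests")
'''

def scaffold_test(package: str, module: str, root: Path = Path(".")) -> Path:
    """Create tests/test_<module>.py from the house template."""
    path = root / "tests" / f"test_{module}.py"
    path.parent.mkdir(parents=True, exist_ok=True)
    cls = "".join(part.capitalize() for part in module.split("_"))
    path.write_text(TEMPLATE.format(package=package, module=module, cls=cls))
    return path
```

The skill version of this is the template plus a sentence of instructions; the agent fills in real test cases instead of a placeholder.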
5. Session & Continuity Skills
For maintaining context across sessions:
| Type | Purpose |
|---|---|
| Checkpoint | Save state at decision points for branching |
| Handoff | Generate continuation prompt for new session |
| Context updater | Keep architecture docs current |
| Health checker | Overall codebase health metrics |
| Debt scanner | Find and categorize technical debt |
Discovery: Finding Skills to Build
Spend a week tracking what you ask your AI assistant for repeatedly:
- "Review this before I push" → review skill
- "What's the pattern for X here?" → scaffolding skill
- "Trace how this data flows" → navigation skill
- "I want to try a different approach" → checkpoint skill
- "Is this a small or large project?" → scoping skill
The best skills come from observed repetition, not imagination. Don't build a dozen skills on day one. Track what you actually repeat, then encode it.
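If you keep even a crude log of your prompts for that week, surfacing the repeats is a few lines. A minimal sketch, assuming a plain list of logged prompts (the `skill_candidates` name and the threshold of 3 are arbitrary choices):

```python
from collections import Counter

def skill_candidates(prompt_log: list[str],
                     min_count: int = 3) -> list[tuple[str, int]]:
    """Normalize logged prompts and surface the ones repeated often
    enough to be worth encoding as a skill."""
    normalized = [p.strip().lower() for p in prompt_log]
    counts = Counter(normalized)
    return [(p, n) for p, n in counts.most_common() if n >= min_count]
```

Exact-string matching is deliberately dumb; in practice you'll eyeball near-duplicates ("review this" vs. "review this before I push"), but the counter tells you where to look.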
Case Study: The Checkpoint Skill
My favorite skill. Here's how it evolved:
Version 1: Stuck in a Bad Direction
You're 20 messages deep into debugging. The AI has gone down a path that isn't working. You want to go back to message 8 and try a different approach.
But you can't. The conversation is linear. You'd have to start over and re-explain everything.
Version 2: Re-explain From Scratch
My workaround: start a new session and re-explain the entire context. "I'm working on X. I tried Y and Z but they didn't work. Now I want to try A instead."
This worked but was tedious. I'd forget important context. The new session didn't know what had already been tried.
Version 3: Checkpoint at Decision Points
Now I checkpoint at key decision points — before committing to an approach:
/checkpoint
Generates:
- What was accomplished so far
- Key decisions made (and why)
- Approaches tried and failed
- Current hypothesis
- Files modified
Saves to a history file I can retrieve later.
When I want to branch and try a different approach, I start a fresh session and paste the checkpoint. The new session knows exactly where I was and what didn't work.
The key insight: Checkpoints aren't just history — they're branch points. They let you explore alternative approaches without losing context about what you've already tried.
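The mechanics are simple enough to sketch. This is an illustrative implementation, not the actual skill: the field names mirror the five bullets above, and the JSON-lines history file is one reasonable storage choice among several:

```python
import json
import time
from pathlib import Path

def write_checkpoint(history: Path, *, accomplished: list[str],
                     decisions: list[str], failed: list[str],
                     hypothesis: str, files: list[str]) -> dict:
    """Append one checkpoint entry (the five fields from the list above)
    to a JSON-lines history file, so a fresh session can be seeded later."""
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "accomplished": accomplished,
        "decisions": decisions,
        "failed_approaches": failed,
        "hypothesis": hypothesis,
        "files_modified": files,
    }
    with history.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Appending rather than overwriting is what makes branching work: every decision point stays retrievable, not just the latest one.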
Where Skills Live
Skills need a home where your AI can find them. Two tiers work well:
Global skills — Work across any codebase:
- Code review standards
- Planning and estimation
- Session management (checkpoints, handoffs)
Project-specific skills — Encode patterns unique to one repo:
- "How we structure endpoints here"
- "Our component patterns"
- "Integration test conventions"
Keep them separate. Global skills in one location; project skills in each project. Your AI reads both and knows which apply where.
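The two-tier lookup can be sketched as a merge where project skills shadow global ones. The directory layout here (one markdown file per skill) is an assumption, not a requirement of any particular tool:

```python
from pathlib import Path

def load_skills(global_dir: Path, project_dir: Path) -> dict[str, Path]:
    """Collect skill files from both tiers; a project skill with the
    same name shadows the global one."""
    skills: dict[str, Path] = {}
    for tier in (global_dir, project_dir):  # project last, so it wins
        if tier.is_dir():
            for f in sorted(tier.glob("*.md")):
                skills[f.stem] = f
    return skills
```

The shadowing order matters: a project that needs a stricter `/review` than your global default can override it without touching the global definition.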
The Navigation Skill: Understanding Unfamiliar Code
The skill I use most isn't for writing code — it's for understanding it.
When I land in an unfamiliar part of the codebase, I used to do this dance:
- Jump to definition
- Read that file
- Find usages
- Jump to one of those
- Read that file
- Repeat until I understand the flow
Now: /trace user-signup
The skill knows to:
- Start from the entry point (route handler, CLI command, etc.)
- Follow the data through each layer
- Note transformations, validations, and side effects
- Identify where exceptions are caught (or not)
- Map out which services/databases are touched
Output is a flow diagram in text:
POST /api/users/signup
→ validate_request(data) [raises ValidationError]
→ UserRepository.get_by_email() [DB: users table]
→ hash_password() [bcrypt]
→ UserRepository.create() [DB insert]
→ send_welcome_email() [async task, non-blocking]
← Response: { user_id, email, created_at }
What used to take 30 minutes of file-hopping now takes 30 seconds. And the output is something I can paste into a PR description or share with a teammate asking "how does signup work?"
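The diagram format itself is trivial to produce once the agent has the steps, which is part of why it's so shareable. A sketch of the formatter, with hypothetical names:

```python
def render_trace(entry: str, steps: list[tuple[str, str]],
                 response: str) -> str:
    """Format traced steps as the arrow diagram shown above:
    entry point, then '→ call [note]' lines, then '← Response: ...'."""
    lines = [entry]
    for call, note in steps:
        lines.append(f"  → {call} [{note}]")
    lines.append(f"  ← Response: {response}")
    return "\n".join(lines)
```

The hard part is not the rendering but the tracing — deciding which calls, side effects, and exception paths deserve a line — which is exactly what the skill's instructions encode.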
The skill evolved over time:
- v1: Just trace the happy path
- v2: Added exception paths after getting burned by unhandled errors
- v3: Added "what touches the database?" after a performance incident
- v4: Added async tasks after debugging a race condition
Each version came from pain. I didn't design it upfront — I refined it when the current version failed me.
Gotchas and Lessons
1. Prefix Conflicts
If you use /skill, your tool might try to parse it as a built-in command. This happened when I built my session handoff plugin — I wanted /handoff but the slash prefix collided with existing commands, so it became session handoff instead.
Options:
- Use a different trigger style (two words, no slash)
- Prefix with something unique (`my-`, `do-`, etc.)
- Accept the collision and work around it
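A collision check is worth running before you settle on a trigger. A minimal sketch of the prefix fallback; the `BUILTINS` set is hypothetical and would come from your tool's actual command list:

```python
BUILTINS = {"/help", "/clear", "/handoff"}  # hypothetical built-in commands

def safe_trigger(desired: str, builtins: set[str] = BUILTINS) -> str:
    """If the desired slash trigger collides with a built-in, fall back
    to a uniquely prefixed variant, per the options above."""
    if desired in builtins:
        return "/my-" + desired.lstrip("/")
    return desired
```

Checking up front is cheaper than discovering the collision mid-session, when the tool silently swallows your command.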
2. Skills Need Maintenance
Codebase patterns change. Skills that encode old patterns become misleading. Review quarterly:
- Does the command still work?
- Does the pattern still apply?
- Is the tool/framework still correct?
3. Don't Over-Skill
Not everything needs a skill. If you've only done it twice, it's not a pattern yet.
Skill-worthy: Things you do weekly
Not skill-worthy: One-off tasks, edge cases
4. Start Small
Start with 3 skills, not 27. Add more as patterns emerge. You'll know when you need them because you'll catch yourself typing the same prompt again.
The Value Hierarchy
After 6 months, here's what provides the most value:
- Codebase-specific skills — They encode knowledge that lives only in your head
- Policy-enforcement skills — They make team standards automatic
- Checkpoint/history skills — They create a searchable record of decisions
- General development skills — They standardize common tasks
The codebase-specific ones surprise people the most. "Wait, I can just say /new-component and it knows our factory pattern?"
Yes. That's the point.
Getting Started
- Track for a week: What do you repeatedly ask your AI assistant?
- Pick the top 3: Most frequent, most annoying to re-explain
- Define them: Trigger, scope, behavior, output
- Document them: So the AI knows about them
- Iterate: Skills evolve as you use them
The first three skills will teach you what works. Then add more.
The Meta-Skill
Building skills is itself a skill. You learn:
- What's actually repetitive vs. what just feels repetitive
- How to specify behavior precisely
- When to encode flexibility vs. rigidity
- How to maintain and evolve commands over time
This meta-skill transfers. Once you think in terms of "encode the pattern," you see opportunities everywhere — not just with AI.
Start with 3. You'll know when you need more.