Claude Code Skills vs MCP — Know Which Layer to Reach For
Skills and MCP solve different problems in Claude Code. Here's when each wins, when to combine them, and what the optimal hybrid looks like for solo devs and small teams.
The stack (3 tools)
- Claude Skills — transparent, token-efficient, no server overhead, native to Claude Code with progressive context loading
- MCP — cross-platform standard, enterprise audit trails, 10,000+ server catalog for external system access
- gstack — concrete proof that a Skills-first architecture can replace what most developers reflexively reach for MCP to do
TL;DR
- Skills-first for solo/small teams: Encode known workflows as lazy-loaded Markdown modules — transparent, token-efficient, zero server overhead
- MCP for external integrations and audit: Use it when Claude needs to touch GitHub, databases, or APIs that aren’t filesystem-accessible; MCP handles cross-client portability and structured logging
- The hybrid wins: CLAUDE.md as a thin routing index, Skills as your workflow layer, MCP as your data gateway — each layer doing what it’s actually good at
Most of our MCP coverage assumes MCP is the default infrastructure choice for Claude Code. It isn’t. We’ve been overcorrecting toward protocol complexity when the actual problem — for solo devs and small teams — is workflow discipline. Skills solve that better, cheaper, and with full transparency. This article is us correcting our own MCP-maximalist bias.
The Skills vs MCP question isn’t a competition. It’s a layer problem. Get the layers wrong and you burn a third of your context window before typing a single prompt. Get them right and Claude behaves like it has working memory of your codebase, your standards, and your release process.
Stack Overview
This stack gives you a hybrid Claude Code setup that uses Skills for workflow orchestration and MCP for external data access — with a clear decision rule for which to reach for first.
Components:
- Claude Skills — on-demand Markdown modules that load procedural knowledge when Claude needs it, not before
- MCP (Model Context Protocol) — the external connectivity layer for GitHub, databases, APIs, and services Claude can’t touch via the filesystem
- gstack — the reference implementation showing what a Skills-first workflow looks like at production scale
What the combination solves that either alone cannot: Skills give Claude structured workflows without consuming context permanently. MCP gives Claude access to external systems without you writing glue code. Together, they let you build a Claude Code setup where the agent knows how to work (Skills) and what to work with (MCP) — without the two competing for your context budget.
Architecture flow (text version): Claude Code session → loads thin CLAUDE.md → routes workflow tasks to the Skills layer (/plan, /review, /qa, /ship) and external data tasks to the MCP layer (GitHub, Postgres, Sentry) → both layers feed outputs back to the session.
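The routing rule in that flow can be sketched as a trivial classifier. This is an illustrative reduction, not anything Claude Code actually runs; the keyword lists are made up for the example:

```python
# Illustrative routing sketch: which layer does a task belong to?
# Keyword lists are hypothetical examples, not an official API.
WORKFLOW_TRIGGERS = {"plan", "review", "qa", "ship", "architecture", "refactor"}
EXTERNAL_TRIGGERS = {"github", "pr", "database", "postgres", "sentry", "api"}

def route(task: str) -> str:
    """Return which layer should handle a task: 'mcp', 'skills', or 'session'."""
    words = set(task.lower().split())
    if words & EXTERNAL_TRIGGERS:
        return "mcp"      # external system access goes to the MCP layer
    if words & WORKFLOW_TRIGGERS:
        return "skills"   # known procedure goes to the Skills layer
    return "session"      # plain conversation needs neither
```

The point of the sketch is the ordering: external data access is routed first, because a workflow that touches GitHub still needs MCP underneath it.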
Components
Claude Skills (October 2025, stable April 2026)
Skills are folders containing a SKILL.md file — YAML frontmatter plus Markdown instructions — that Claude discovers and loads when the task matches. Think of them as on-demand training manuals: Claude scans available Skills, finds semantic matches, and loads only what it needs for the current task. The rest stays out of context.
The critical mechanic is progressive disclosure. Skills descriptions load at roughly 30–50 tokens each. The full Skill content only loads when invoked. Compare that to putting everything in CLAUDE.md upfront: one team (the ClaudeFast Code Kit project) documented an 82% token reduction — recovering roughly 15,000 tokens per session — by migrating from always-on CLAUDE.md instructions to Skills-based progressive loading. That’s not a marginal improvement; at 200k context it’s the difference between hitting a wall mid-task and having room to finish.
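The two reported figures imply a concrete before/after budget. Assuming the 82% reduction and the ~15,000 recovered tokens describe the same baseline, the arithmetic works out as:

```python
# Figures cited above (ClaudeFast Code Kit report); the baseline is inferred,
# not separately reported.
recovered = 15_000   # tokens recovered per session
reduction = 0.82     # reported reduction from always-on CLAUDE.md to Skills

implied_before = recovered / reduction      # always-on footprint this implies
implied_after = implied_before - recovered  # what remains after migration

print(round(implied_before), round(implied_after))  # roughly 18,300 -> 3,300
```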
Why Skills in this stack:
- Transparent by design — you read the Markdown, you know exactly what Claude will do
- No external server process to maintain or authenticate
- `disable-model-invocation: true` in frontmatter prevents Claude from auto-triggering side-effect workflows (deploys, Slack messages, commits) without explicit user invocation
- Lazy loading recovers substantial context budget versus front-loaded `CLAUDE.md` bloat
| Alternative | Difference | Switch if |
|---|---|---|
| CLAUDE.md instructions | Always loaded, consumes context permanently | You have <5 universal behaviors that apply to every single session |
| MCP prompts | Protocol-dependent, external server | You need the same procedural knowledge to work across multiple AI clients |
| Custom system prompts | No Anthropic-supported structure, not portable | You’re not using Claude Code |
MCP — Model Context Protocol (spec 2025-03-26, Linux Foundation governance Nov 2025)
MCP is the open standard for connecting Claude to external systems — GitHub, databases, Slack, Figma, anything not accessible via the filesystem. The protocol now has backing from Anthropic, OpenAI, Google, Microsoft, AWS, Cloudflare, Block, and Bloomberg, with a catalog exceeding 10,000 active servers.
The problem MCP creates is token overhead. Seven MCP servers consume 67,300 tokens — 33.7% of Claude’s 200k context budget — before you’ve written a single message. GitHub’s official MCP alone uses tens of thousands of tokens. This is the real cost most tutorials skip: you’re paying for tool definitions whether you use those tools in a given session or not.
Anthropic’s Tool Search feature (lazy tool loading) reduces this overhead by 85% by withholding tool definitions until Claude actually needs them. Use it. But even with Tool Search, the rule holds: install MCP servers for external systems you actually call in your workflow, not ones you might need someday. And consider the CLI alternative before you reach for MCP at all — LLMs already know how to call `gh --help`, `aws --help`, `psql --help`. You don’t have to spend tokens describing how to use tools that have well-known CLIs.
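The arithmetic behind those percentages, using only the figures cited above:

```python
CONTEXT_BUDGET = 200_000          # Claude's context window, in tokens
SEVEN_SERVER_OVERHEAD = 67_300    # tool-definition cost for seven MCP servers
TOOL_SEARCH_REDUCTION = 0.85      # reported savings from lazy tool loading

idle_share = SEVEN_SERVER_OVERHEAD / CONTEXT_BUDGET
remaining = SEVEN_SERVER_OVERHEAD * (1 - TOOL_SEARCH_REDUCTION)

print(f"{idle_share:.1%} of budget gone before the first message")
print(f"~{remaining:,.0f} tokens of overhead left with Tool Search enabled")
```

Even in the best case, roughly 10k tokens of overhead remain, which is why the "only install what you call" rule still applies.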
Why MCP in this stack:
- No alternative for external system access (GitHub PRs, database queries, monitoring alerts)
- Cross-platform portability — same MCP server works in Cursor, Claude Desktop, other clients
- `.mcp.json` enables team-shared configurations without manual setup per developer
- Enterprise audit trails for tool invocations, scoped permissions per project
| Alternative | Difference | Switch if |
|---|---|---|
| CLI tools (`gh`, `aws`, `psql`) | More token-efficient; LLMs know `--help` natively | You’re comfortable with Claude using shell commands |
| Direct API calls in Skills | Encoded in Markdown, no separate server | The API is simple enough to describe as procedure |
| Skills with bash scripts | Fully local, no server | You only need filesystem + git access |
gstack (v0.3.3+)
Garry Tan’s open-source skill pack surpassed 10,000 GitHub stars in its first 48 hours and exceeded 33,000 stars by the end of week one. Then it attracted the most useful criticism it could have received: “It’s just a bunch of prompts in a text file.” That’s exactly what makes it worth studying.
A quick clarification on counts: the README header describes “23 opinionated tools” as the design intent, but the actual shipped repository lists 30+ slash commands — /plan-ceo-review, /plan-eng-review, /design-consultation, /review, /ship, /land-and-deploy, /canary, /qa, /retro, /investigate, /benchmark, /browse, /codex, and more. The number is a moving target; treat “30+” as accurate and the “23” as legacy branding.
The /browse and /qa skills are where “a bunch of prompts” undersells the engineering: they run a persistent Chromium daemon via a compiled 58MB Bun binary. First invocation takes about 3 seconds for browser cold start. Every subsequent call completes in 100–200ms via localhost HTTP. Cookies, tabs, and localStorage persist across commands. Auto-shutdown after 30 minutes idle. The Bun binary also reads Chromium’s SQLite cookie database directly, removing the need for a separate runtime toolchain. This is a deliberate architectural decision — not a prompt trick.
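The daemon pattern here (persistent process, fast reuse across calls, auto-shutdown when idle) is worth internalizing. Below is an illustrative Python reduction of just the idle-timeout piece, not gstack's actual Bun implementation:

```python
import time

class IdleWatchdog:
    """Signals shutdown once no activity is seen for `idle_limit_s` seconds.

    Sketch of the auto-shutdown behavior described above; gstack's daemon
    reportedly uses a 30-minute window, we use seconds for illustration.
    """

    def __init__(self, idle_limit_s: float):
        self.idle_limit_s = idle_limit_s
        self.last_activity = time.monotonic()

    def touch(self) -> None:
        """Record a request; each /browse or /qa call would do this."""
        self.last_activity = time.monotonic()

    def should_shut_down(self) -> bool:
        return time.monotonic() - self.last_activity >= self.idle_limit_s
```

A background thread polling `should_shut_down()` is all it takes to get the "free when used, gone when forgotten" property without a long-lived server to babysit.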
Tan’s reported output using this setup: 10,000 lines of code and 100 pull requests per week over 50 days. Use gstack as a reference architecture, not a cargo-cult install. The value isn’t the specific prompts — it’s the pattern of encoding role-based perspectives (CEO product thinking, engineering manager architecture review, QA tester paranoia) as callable procedures.
Why gstack in this stack:
- Proves Skills-first works at production cadence, not just in demos
- Browser automation without an MCP server — faster, simpler, local
- Role-based agent structure shows how to encode institutional knowledge as callable procedures
| Alternative | Difference | Switch if |
|---|---|---|
| Playwright MCP | Protocol-based browser control, cross-client | You need browser automation across multiple AI tools |
| Custom Skills from scratch | Full control, no conventions | You need workflows gstack doesn’t cover |
| Claude with no Skills | Generic behavior, no encoded procedures | You’re evaluating before committing to a workflow |
Setup Walkthrough
Step 1: Initialize your CLAUDE.md as a routing index, not an instruction manual
CLAUDE.md should stay under 150 lines. It’s your session-universal context: project identity, tech stack, critical constraints. Everything procedural moves to Skills. The routing table at the bottom is how Claude learns which Skill to load before you’ve even finished typing your request.
```markdown
# .claude/CLAUDE.md

## Project
[Project name and one-sentence purpose]

## Stack
[Language, framework, database, deployment target]

## Hard constraints
- Never commit directly to main
- All deploys require /ship skill invocation
- DB migrations require human approval

## Active Skills
| Trigger | Skill | When to use |
|---------|-------|-------------|
| planning, architecture | /plan | Before starting any feature |
| code review, PR | /review | Before any merge |
| testing, browser | /qa | After implementation |
| release, deploy | /ship | Only when explicitly invoked |
```
Step 2: Create your Skills directory structure
Skills live in .claude/skills/. Each Skill gets its own directory with a SKILL.md file. This is the structure Claude’s discovery mechanism expects — flat files won’t work.
```shell
# Terminal
mkdir -p .claude/skills/review
mkdir -p .claude/skills/plan
mkdir -p .claude/skills/ship
mkdir -p .claude/skills/qa
```
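Since flat files won't be discovered, a quick sanity check helps after restructuring. This is a hypothetical helper, not part of Claude Code:

```python
from pathlib import Path

def missing_skill_files(skills_root: str) -> list[str]:
    """Return skill directory names under .claude/skills/ that lack a SKILL.md."""
    return sorted(d.name for d in Path(skills_root).iterdir()
                  if d.is_dir() and not (d / "SKILL.md").exists())
```

Run it against `.claude/skills` and an empty list means every skill directory has the file the discovery mechanism expects.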
Step 3: Write a Skill — the /review example
The frontmatter controls invocation behavior. The Markdown content is what Claude executes. Keep descriptions specific — Claude uses semantic matching against them. “Does stuff” won’t trigger. “Reviews code for security issues, logic errors, and adherence to team conventions; use when asked to audit a PR or review recent changes” will. Invest the 30 seconds writing a description that tells Claude exactly when this Skill applies.
```markdown
# .claude/skills/review/SKILL.md
---
name: review
description: "Reviews code for security issues, logic errors, performance problems, and team convention violations. Invoked when asked to review code, audit a PR, or check recent changes before merging."
disable-model-invocation: false
---

## Code Review Protocol
1. **Security first**: Check for injection vulnerabilities, exposed secrets, auth bypass paths
2. **Logic errors**: Trace the happy path, then two failure modes
3. **Performance**: Flag N+1 queries, unbounded loops, missing indexes
4. **Conventions**: Verify naming, error handling patterns, and test coverage match project standards

## Output format
- Summary: one sentence verdict
- Critical issues: must fix before merge (list)
- Minor issues: should fix (list)
- Approved: explicit yes/no
```
Step 4: Write a dangerous Skill with invocation protection
Any Skill with side effects — deploys, commits, external messages — gets disable-model-invocation: true. This means Claude cannot auto-trigger it based on context matching. Only explicit user invocation (/ship) activates it. This is the guardrail that prevents Claude from deciding the code looks ready and deploying without asking.
```markdown
# .claude/skills/ship/SKILL.md
---
name: ship
description: "Executes the full release checklist: tests, build, changelog, version bump, PR, deploy. Only run when explicitly asked to ship or release."
disable-model-invocation: true
---

## Release Checklist
1. Confirm all tests pass: `npm test`
2. Run build: `npm run build`
3. Update CHANGELOG.md with changes since last release
4. Bump version in package.json
5. Create PR with release notes
6. **STOP. Request human approval before deploy.**
7. On approval: execute deploy command

## Never auto-trigger
This skill has destructive potential. Confirm intent before each step.
```
Step 5: Add MCP servers only for external systems you actually use
Create or update .mcp.json in your project root. Only include servers for systems you actively integrate this session. The token cost is real — every listed server loads definition tokens whether used or not unless Tool Search is active. Two servers is a reasonable default for most solo dev setups.
```json
// .mcp.json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": {
        "DATABASE_URL": "${DATABASE_URL}"
      }
    }
  }
}
```
Step 6: Create your .env.example for MCP credentials
Never put real credentials in .mcp.json or commit them. Environment variables only. The .env.example is documentation for yourself six months from now and for any collaborators who clone the repo.
```bash
# .env.example

# GitHub MCP Server
GITHUB_TOKEN=ghp_your_token_here

# Postgres MCP Server
DATABASE_URL=postgresql://user:password@localhost:5432/dbname

# Add to .gitignore: .env
```
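A small cross-check catches drift between the two files: every `${VAR}` placeholder in `.mcp.json` should have a matching line in `.env.example`. This is an illustrative helper, assuming the `${VAR}` convention shown above:

```python
import re

def undocumented_env_vars(mcp_json_text: str, env_example_text: str) -> set[str]:
    """Return ${VAR} names referenced in .mcp.json but missing from .env.example."""
    referenced = set(re.findall(r"\$\{(\w+)\}", mcp_json_text))
    documented = {line.split("=", 1)[0].strip()
                  for line in env_example_text.splitlines()
                  if "=" in line and not line.lstrip().startswith("#")}
    return referenced - documented
```

An empty set means your future self (or a collaborator cloning the repo) has every credential documented.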
Step 7: Test Skill invocation and monitor context budget
After loading your MCP servers, check what percentage of budget they consume before any conversation starts. If you’re already above 20%, audit which MCP servers are actually necessary for this session. The /compact command is your friend when approaching the context limit mid-session.
```shell
# In Claude Code terminal
/config   # View current config and context usage
/compact  # Compress conversation history when approaching 50% capacity
/review   # Test explicit Skill invocation by name

# Then test natural language: "Can you review the last changes I made?"
# Claude should auto-invoke /review via semantic matching on the description
```
Pricing
| Component | License | Free Tier | Paid from | Note |
|---|---|---|---|---|
| Claude Skills | MIT (Anthropic) | Unlimited | — | Included in Claude Code; no separate cost |
| Claude Code | Proprietary | — | $20/mo (Max) | Skills require Claude Code Max plan |
| MCP Protocol | Apache 2.0 | Unlimited | — | Protocol is free; server hosting varies |
| MCP servers (self-hosted) | Various | Unlimited | $0–$10/mo (hosting) | Hetzner VPS ~$4/mo handles most setups |
| gstack | MIT | Unlimited | — | Open source; browser binary is free |
Prices as of April 2026.
Self-hosting MCP servers adds ~$4–10/month for a basic VPS. Managed MCP hosting services exist but add unnecessary complexity for solo devs — run MCP servers locally or on a cheap VPS you control.
Claude Code requires the Max plan ($20/mo) for Skills workflows to be reliable in practice. The Pro plan’s context limits cause complex Skills sessions to hit ceilings mid-task. As of April 2026, if Skills are central to your workflow, the Max plan is the real floor — budget accordingly. Check Claude’s official pricing page for current plan details, as Anthropic updates these periodically.
When This Stack Fits
You have known, stable workflows. If your code review process, planning ritual, and release checklist are consistent enough to write down, they’re consistent enough to encode as Skills. If your process changes every week, you’re not ready for Skills — you’ll spend more time rewriting Markdown than shipping code.
You’re solo or a team of fewer than 10. .mcp.json sharing via git handles team MCP config. A shared Skills directory in version control handles team workflows. Beyond 10 people, you’ll want dedicated MCP server infrastructure and a formal Skills governance process this stack doesn’t provide.
You want full transparency over what the agent does. Every Skill is a readable Markdown file. You can audit exactly what Claude will do before you invoke anything. If you’ve been burned by black-box agent behavior, this is the antidote — there’s nothing opaque here.
Token efficiency matters for your session length. In long sessions on complex codebases, the gap between idle MCP servers consuming a third of context and Skills that load on demand is the difference between hitting the context limit mid-task and finishing the task. The savings compound over long sessions.
You need to integrate with external services. You have a GitHub repo, a production database, or monitoring alerts that Claude needs to read. MCP handles this cleanly. Trying to solve it via Skills bash scripts creates fragile workarounds that break when APIs change.
When This Stack Does Not Fit
You need audit trails for agent actions. Skills leave no structured log of what was invoked, when, or by whom. If your organization requires audit trails for AI-assisted work — compliance, security review, team accountability — MCP’s structured tool invocation logging is what you need. Skills won’t satisfy this requirement, and you shouldn’t try to bolt logging onto them.
You’re building tooling for multiple AI clients. If the same workflow needs to run in Cursor, Claude Desktop, and Claude Code, encode it as an MCP server or prompt, not a Skill. Skills are Claude Code-native. MCP is the cross-platform layer. For teams building developer tooling rather than using it, MCP’s portability is the entire point.
Your workflows are genuinely undiscovered. Skills work best for known procedures you can articulate. If you’re in early exploration mode — unsure what your workflow even is — the 10,000+ MCP server catalog is actually valuable for tool discovery. You can try tools without writing Skills for them first. Use MCP to discover, then encode what you find into Skills once patterns emerge.
You’re replacing CI/CD. Neither Skills nor MCP is a deployment pipeline. /ship with disable-model-invocation: true can orchestrate a release checklist, but it is not GitHub Actions. It has no retry logic, no structured failure handling, and no audit history. Use proper CI/CD for production deployments; use Skills to prepare for and coordinate them.
The Honest Architecture
Simon Willison’s observation that Skills are “maybe a bigger deal than MCP” points at something real: the productivity gains in Claude Code setups come from workflow discipline, not protocol complexity. Skills encode that discipline in a format Claude can actually use — readable, lazily loaded, semantically indexed.
But MCP’s value isn’t protocol elegance. It’s that someone else already built the GitHub integration, the Postgres connector, the Sentry bridge. You don’t write glue code. You install a server, configure credentials, and Claude can query your production database. That’s genuinely hard to replicate with Skills bash scripts without rebuilding what MCP servers already provide.
The pattern that works in practice: Skills as your workflow layer, MCP as your data gateway. CLAUDE.md stays thin — project identity, hard constraints, a routing table pointing at Skills. Skills hold the procedural knowledge: how to plan a feature, how to review code, how to run QA, how to ship. MCP handles the external calls: read this GitHub PR, query this database, fetch this Sentry error.
What neither layer does: replace the domain expertise that makes a workflow worth encoding in the first place. gstack is compelling not because it’s prompts in text files but because Garry Tan’s opinions about software engineering are worth encoding. Your Skills are only as good as the workflows they capture. That part is still on you.
The MCP-maximalist position — install everything, route everything through the protocol — creates setups where Claude spends a third of its context budget on tool definitions it won’t use this session. The Skills-first position with surgical MCP usage spends that budget on the actual work. For solo devs and small teams grinding features, that trade is obvious. Make it deliberately rather than by default.