Claude Code Skills vs MCP — Know Which Layer to Reach For
Skills and MCP solve different problems in Claude Code. Here's when each wins, when to combine them, and what the optimal hybrid looks like for solo devs and small teams.
The stack (3 tools)
- Claude Skills — transparent, token-efficient, no server overhead, native to Claude Code with progressive context loading
- MCP — cross-platform standard, enterprise audit trails, 10,000+ server catalog for external system access
- gstack — concrete proof that a Skills-first architecture can replace what most developers reflexively reach for MCP to do
TL;DR
- Skills-first for solo/small teams: Encode known workflows as lazy-loaded Markdown modules — transparent, token-efficient, zero server overhead
- MCP for external integrations and audit: Use it when Claude needs to touch GitHub, databases, or APIs that aren’t filesystem-accessible; MCP handles cross-client portability and structured logging
- The hybrid wins: CLAUDE.md as a thin routing index, Skills as your workflow layer, MCP as your data gateway — each layer doing what it’s actually good at
Most of our MCP coverage assumes MCP is the default infrastructure choice for Claude Code. It isn’t. We’ve been overcorrecting toward protocol complexity when the actual problem — for solo devs and small teams — is workflow discipline. Skills solve that better, cheaper, and with full transparency. This article is us correcting our own MCP-maximalist bias.
The Skills vs MCP question isn’t a competition. It’s a layer problem. Get the layers wrong and you burn a third of your context window before typing a single prompt. Get them right and Claude behaves like it has working memory of your codebase, your standards, and your release process.
Stack Overview
This stack gives you a hybrid Claude Code setup that uses Skills for workflow orchestration and MCP for external data access — with a clear decision rule for which to reach for first.
Components:
- Claude Skills — on-demand Markdown modules that load procedural knowledge when Claude needs it, not before
- MCP (Model Context Protocol) — the external connectivity layer for GitHub, databases, APIs, and services Claude can’t touch via the filesystem
- gstack — the reference implementation showing what a Skills-first workflow looks like at production scale
What the combination solves that either alone cannot: Skills give Claude structured workflows without consuming context permanently. MCP gives Claude access to external systems without you writing glue code. Together, they let you build a Claude Code setup where the agent knows how to work (Skills) and what to work with (MCP) — without the two competing for your context budget.
Architecture flow (text version): Claude Code session → loads thin CLAUDE.md → routes workflow tasks to the Skills layer (/plan, /review, /qa, /ship) and external data tasks to the MCP layer (GitHub, Postgres, Sentry) → both layers feed outputs back to the session.
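The routing rule in that flow can be sketched as a trivial classifier. This is an illustrative reduction, not anything Claude Code actually runs; the keyword lists are made up for the example:

```python
# Illustrative routing sketch: which layer does a task belong to?
# Keyword lists are hypothetical examples, not an official API.
WORKFLOW_TRIGGERS = {"plan", "review", "qa", "ship", "architecture", "refactor"}
EXTERNAL_TRIGGERS = {"github", "pr", "database", "postgres", "sentry", "api"}

def route(task: str) -> str:
    """Return which layer should handle a task: 'mcp', 'skills', or 'session'."""
    words = set(task.lower().split())
    if words & EXTERNAL_TRIGGERS:
        return "mcp"      # external system access goes to the MCP layer
    if words & WORKFLOW_TRIGGERS:
        return "skills"   # known procedure goes to the Skills layer
    return "session"      # plain conversation needs neither
```

The point of the sketch is the ordering: external data access is routed first, because a workflow that touches GitHub still needs MCP underneath it.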
Components
Claude Skills (October 2025, stable April 2026)
Skills are folders containing a SKILL.md file — YAML frontmatter plus Markdown instructions — that Claude discovers and loads when the task matches. Think of them as on-demand training manuals: Claude scans available Skills, finds semantic matches, and loads only what it needs for the current task. The rest stays out of context.
The critical mechanic is progressive disclosure. Skills descriptions load at roughly 30–50 tokens each. The full Skill content only loads when invoked. Compare that to putting everything in CLAUDE.md upfront: one team (the ClaudeFast Code Kit project) documented an 82% token reduction — recovering roughly 15,000 tokens per session — by migrating from always-on CLAUDE.md instructions to Skills-based progressive loading. That’s not a marginal improvement; at 200k context it’s the difference between hitting a wall mid-task and having room to finish.
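The two reported figures imply a concrete before/after budget. Assuming the 82% reduction and the ~15,000 recovered tokens describe the same baseline, the arithmetic works out as:

```python
# Figures cited above (ClaudeFast Code Kit report); the baseline is inferred,
# not separately reported.
recovered = 15_000   # tokens recovered per session
reduction = 0.82     # reported reduction from always-on CLAUDE.md to Skills

implied_before = recovered / reduction      # always-on footprint this implies
implied_after = implied_before - recovered  # what remains after migration

print(round(implied_before), round(implied_after))  # roughly 18,300 -> 3,300
```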
Why Skills in this stack:
- Transparent by design — you read the Markdown, you know exactly what Claude will do
- No external server process to maintain or authenticate
- `disable-model-invocation: true` in frontmatter prevents Claude from auto-triggering side-effect workflows (deploys, Slack messages, commits) without explicit user invocation
- Lazy loading recovers substantial context budget versus front-loaded `CLAUDE.md` bloat
| Alternative | Difference | Switch if |
|---|---|---|
| CLAUDE.md instructions | Always loaded, consumes context permanently | You have <5 universal behaviors that apply to every single session |
| MCP prompts | Protocol-dependent, external server | You need the same procedural knowledge to work across multiple AI clients |
| Custom system prompts | No Anthropic-supported structure, not portable | You’re not using Claude Code |
MCP — Model Context Protocol (spec 2025-03-26, Linux Foundation governance Nov 2025)
MCP is the open standard for connecting Claude to external systems — GitHub, databases, Slack, Figma, anything not accessible via the filesystem. The protocol now has backing from Anthropic, OpenAI, Google, Microsoft, AWS, Cloudflare, Block, and Bloomberg, with a catalog exceeding 10,000 active servers.
The problem MCP creates is token overhead. Seven MCP servers consume 67,300 tokens — 33.7% of Claude’s 200k context budget — before you’ve written a single message. GitHub’s official MCP alone uses tens of thousands of tokens. This is the real cost most tutorials skip: you’re paying for tool definitions whether you use those tools in a given session or not.
Anthropic’s Tool Search feature (lazy tool loading) reduces this overhead by 85% by withholding tool definitions until Claude actually needs them. Use it. But even with Tool Search, the rule holds: install MCP servers for external systems you actually call in your workflow, not ones you might need someday. And consider the CLI alternative before you reach for MCP at all — LLMs already know how to call `gh --help`, `aws --help`, `psql --help`. You don’t have to spend tokens describing how to use tools that have well-known CLIs.
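The arithmetic behind those percentages, using only the figures cited above:

```python
CONTEXT_BUDGET = 200_000          # Claude's context window, in tokens
SEVEN_SERVER_OVERHEAD = 67_300    # tool-definition cost for seven MCP servers
TOOL_SEARCH_REDUCTION = 0.85      # reported savings from lazy tool loading

idle_share = SEVEN_SERVER_OVERHEAD / CONTEXT_BUDGET
remaining = SEVEN_SERVER_OVERHEAD * (1 - TOOL_SEARCH_REDUCTION)

print(f"{idle_share:.1%} of budget gone before the first message")
print(f"~{remaining:,.0f} tokens of overhead left with Tool Search enabled")
```

Even in the best case, roughly 10k tokens of overhead remain, which is why the "only install what you call" rule still applies.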
Why MCP in this stack:
- No alternative for external system access (GitHub PRs, database queries, monitoring alerts)
- Cross-platform portability — same MCP server works in Cursor, Claude Desktop, other clients
- `.mcp.json` enables team-shared configurations without manual setup per developer
- Enterprise audit trails for tool invocations, scoped permissions per project
| Alternative | Difference | Switch if |
|---|---|---|
| CLI tools (`gh`, `aws`, `psql`) | More token-efficient; LLMs know `--help` natively | You’re comfortable with Claude using shell commands |
| Direct API calls in Skills | Encoded in Markdown, no separate server | The API is simple enough to describe as procedure |
| Skills with bash scripts | Fully local, no server | You only need filesystem + git access |
gstack (v0.3.3+)
Garry Tan’s open-source skill pack surpassed 10,000 GitHub stars in its first 48 hours and exceeded 33,000 stars by the end of week one. Then it attracted the most useful criticism it could have received: “It’s just a bunch of prompts in a text file.” That’s exactly what makes it worth studying.
A quick clarification on counts: the README header describes “23 opinionated tools” as the design intent, but the actual shipped repository lists 30+ slash commands — /plan-ceo-review, /plan-eng-review, /design-consultation, /review, /ship, /land-and-deploy, /canary, /qa, /retro, /investigate, /benchmark, /browse, /codex, and more. The number is a moving target; treat “30+” as accurate and the “23” as legacy branding.
The /browse and /qa skills are where “a bunch of prompts” undersells the engineering: they run a persistent Chromium daemon via a compiled 58MB Bun binary. First invocation takes about 3 seconds for browser cold start. Every subsequent call completes in 100–200ms via localhost HTTP. Cookies, tabs, and localStorage persist across commands. Auto-shutdown after 30 minutes idle. The Bun binary also reads Chromium’s SQLite cookie database directly, removing the need for a separate runtime toolchain. This is a deliberate architectural decision — not a prompt trick.
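The daemon pattern here (persistent process, fast reuse across calls, auto-shutdown when idle) is worth internalizing. Below is an illustrative Python reduction of just the idle-timeout piece, not gstack's actual Bun implementation:

```python
import time

class IdleWatchdog:
    """Signals shutdown once no activity is seen for `idle_limit_s` seconds.

    Sketch of the auto-shutdown behavior described above; gstack's daemon
    reportedly uses a 30-minute window, we use seconds for illustration.
    """

    def __init__(self, idle_limit_s: float):
        self.idle_limit_s = idle_limit_s
        self.last_activity = time.monotonic()

    def touch(self) -> None:
        """Record a request; each /browse or /qa call would do this."""
        self.last_activity = time.monotonic()

    def should_shut_down(self) -> bool:
        return time.monotonic() - self.last_activity >= self.idle_limit_s
```

A background thread polling `should_shut_down()` is all it takes to get the "free when used, gone when forgotten" property without a long-lived server to babysit.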
Tan’s reported output using this setup: 10,000 lines of code and 100 pull requests per week over 50 days. Use gstack as a reference architecture, not a cargo-cult install. The value isn’t the specific prompts — it’s the pattern of encoding role-based perspectives (CEO product thinking, engineering manager architecture review, QA tester paranoia) as callable procedures.
Why gstack in this stack:
- Proves Skills-first works at production cadence, not just in demos
- Browser automation without an MCP server — faster, simpler, local
- Role-based agent structure shows how to encode institutional knowledge as callable procedures
| Alternative | Difference | Switch if |
|---|---|---|
| Playwright MCP | Protocol-based browser control, cross-client | You need browser automation across multiple AI tools |
| Custom Skills from scratch | Full control, no conventions | You need workflows gstack doesn’t cover |
| Claude with no Skills | Generic behavior, no encoded procedures | You’re evaluating before committing to a workflow |
Setup Walkthrough
Step 1: Initialize your CLAUDE.md as a routing index, not an instruction manual
CLAUDE.md should stay under 150 lines. It’s your session-universal context: project identity, tech stack, critical constraints. Everything procedural moves to Skills. The routing table at the bottom is how Claude learns which Skill to load before you’ve even finished typing your request.
```markdown
# .claude/CLAUDE.md

## Project
[Project name and one-sentence purpose]

## Stack
[Language, framework, database, deployment target]

## Hard constraints
- Never commit directly to main
- All deploys require /ship skill invocation
- DB migrations require human approval

## Active Skills
| Trigger | Skill | When to use |
|---------|-------|-------------|
| planning, architecture | /plan | Before starting any feature |
| code review, PR | /review | Before any merge |
| testing, browser | /qa | After implementation |
| release, deploy | /ship | Only when explicitly invoked |
```
Step 2: Create your Skills directory structure
Skills live in .claude/skills/. Each Skill gets its own directory with a SKILL.md file. This is the structure Claude’s discovery mechanism expects — flat files won’t work.
```shell
# Terminal
mkdir -p .claude/skills/review
mkdir -p .claude/skills/plan
mkdir -p .claude/skills/ship
mkdir -p .claude/skills/qa
```
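Since flat files won't be discovered, a quick sanity check helps after restructuring. This is a hypothetical helper, not part of Claude Code:

```python
from pathlib import Path

def missing_skill_files(skills_root: str) -> list[str]:
    """Return skill directory names under .claude/skills/ that lack a SKILL.md."""
    return sorted(d.name for d in Path(skills_root).iterdir()
                  if d.is_dir() and not (d / "SKILL.md").exists())
```

Run it against `.claude/skills` and an empty list means every skill directory has the file the discovery mechanism expects.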
Step 3: Write a Skill — the /review example
The frontmatter controls invocation behavior. The Markdown content is what Claude executes. Keep descriptions specific — Claude uses semantic matching against them. “Does stuff” won’t trigger. “Reviews code for security issues, logic errors, and adherence to team conventions; use when asked to audit a PR or review recent changes” will. Invest the 30 seconds writing a description that tells Claude exactly when this Skill applies.
```markdown
# .claude/skills/review/SKILL.md
---
name: review
description: "Reviews code for security issues, logic errors, performance problems, and team convention violations. Invoked when asked to review code, audit a PR, or check recent changes before merging."
disable-model-invocation: false
---

## Code Review Protocol
1. **Security first**: Check for injection vulnerabilities, exposed secrets, auth bypass paths
2. **Logic errors**: Trace the happy path, then two failure modes
3. **Performance**: Flag N+1 queries, unbounded loops, missing indexes
4. **Conventions**: Verify naming, error handling patterns, and test coverage match project standards

## Output format
- Summary: one sentence verdict
- Critical issues: must fix before merge (list)
- Minor issues: should fix (list)
- Approved: explicit yes/no
```
Step 4: Write a dangerous Skill with invocation protection
Any Skill with side effects — deploys, commits, external messages — gets disable-model-invocation: true. This means Claude cannot auto-trigger it based on context matching. Only explicit user invocation (/ship) activates it. This is the guardrail that prevents Claude from deciding the code looks ready and deploying without asking.
```markdown
# .claude/skills/ship/SKILL.md
---
name: ship
description: "Executes the full release checklist: tests, build, changelog, version bump, PR, deploy. Only run when explicitly asked to ship or release."
disable-model-invocation: true
---

## Release Checklist
1. Confirm all tests pass: `npm test`
2. Run build: `npm run build`
3. Update CHANGELOG.md with changes since last release
4. Bump version in package.json
5. Create PR with release notes
6. **STOP. Request human approval before deploy.**
7. On approval: execute deploy command

## Never auto-trigger
This skill has destructive potential. Confirm intent before each step.
```
Step 5: Add MCP servers only for external systems you actually use
Create or update .mcp.json in your project root. Only include servers for systems you actively integrate this session. The token cost is real — every listed server loads definition tokens whether used or not unless Tool Search is active. Two servers is a reasonable default for most solo dev setups.
```json
// .mcp.json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": {
        "DATABASE_URL": "${DATABASE_URL}"
      }
    }
  }
}
```
Step 6: Create your .env.example for MCP credentials
Never put real credentials in .mcp.json or commit them. Environment variables only. The .env.example is documentation for yourself six months from now and for any collaborators who clone the repo.
```bash
# .env.example

# GitHub MCP Server
GITHUB_TOKEN=ghp_your_token_here

# Postgres MCP Server
DATABASE_URL=postgresql://user:password@localhost:5432/dbname

# Add to .gitignore: .env
```
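A small cross-check catches drift between the two files: every `${VAR}` placeholder in `.mcp.json` should have a matching line in `.env.example`. This is an illustrative helper, assuming the `${VAR}` convention shown above:

```python
import re

def undocumented_env_vars(mcp_json_text: str, env_example_text: str) -> set[str]:
    """Return ${VAR} names referenced in .mcp.json but missing from .env.example."""
    referenced = set(re.findall(r"\$\{(\w+)\}", mcp_json_text))
    documented = {line.split("=", 1)[0].strip()
                  for line in env_example_text.splitlines()
                  if "=" in line and not line.lstrip().startswith("#")}
    return referenced - documented
```

An empty set means your future self (or a collaborator cloning the repo) has every credential documented.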
Step 7: Test Skill invocation and monitor context budget
After loading your MCP servers, check what percentage of budget they consume before any conversation starts. If you’re already above 20%, audit which MCP servers are actually necessary for this session. The /compact command is your friend when approaching the context limit mid-session.
```shell
# In Claude Code terminal
/config   # View current config and context usage
/compact  # Compress conversation history when approaching 50% capacity
/review   # Test explicit Skill invocation by name

# Then test natural language: "Can you review the last changes I made?"
# Claude should auto-invoke /review via semantic matching on the description
```
Pricing
| Component | License | Free Tier | Paid from | Note |
|---|---|---|---|---|
| Claude Skills | MIT (Anthropic) | Unlimited | — | Included in Claude Code; no separate cost |
| Claude Code | Proprietary | — | $20/mo (Max) | Skills require Claude Code Max plan |
| MCP Protocol | Apache 2.0 | Unlimited | — | Protocol is free; server hosting varies |
| MCP servers (self-hosted) | Various | Unlimited | $0–$10/mo (hosting) | Hetzner VPS ~$4/mo handles most setups |
| gstack | MIT | Unlimited | — | Open source; browser binary is free |
Prices as of April 2026.
Self-hosting MCP servers adds ~$4–10/month for a basic VPS. Managed MCP hosting services exist but add unnecessary complexity for solo devs — run MCP servers locally or on a cheap VPS you control.
Claude Code requires the Max plan ($20/mo) for Skills workflows to be reliable in practice. The Pro plan’s context limits cause complex Skills sessions to hit ceilings mid-task. As of April 2026, if Skills are central to your workflow, the Max plan is the real floor — budget accordingly. Check Claude’s official pricing page for current plan details, as Anthropic updates these periodically.
When This Stack Fits
You have known, stable workflows. If your code review process, planning ritual, and release checklist are consistent enough to write down, they’re consistent enough to encode as Skills. If your process changes every week, you’re not ready for Skills — you’ll spend more time rewriting Markdown than shipping code.
You’re solo or a team of fewer than 10. .mcp.json sharing via git handles team MCP config. A shared Skills directory in version control handles team workflows. Beyond 10 people, you’ll want dedicated MCP server infrastructure and a formal Skills governance process this stack doesn’t provide.
You want full transparency over what the agent does. Every Skill is a readable Markdown file. You can audit exactly what Claude will do before you invoke anything. If you’ve been burned by black-box agent behavior, this is the antidote — there’s nothing opaque here.
Token efficiency matters for your session length. In long sessions on complex codebases, the gap between idle MCP servers consuming a third of context and Skills that load on demand is the difference between hitting the context limit mid-task and finishing the task. The savings compound over long sessions.
You need to integrate with external services. You have a GitHub repo, a production database, or monitoring alerts that Claude needs to read. MCP handles this cleanly. Trying to solve it via Skills bash scripts creates fragile workarounds that break when APIs change.
When This Stack Does Not Fit
You need audit trails for agent actions. Skills leave no structured log of what was invoked, when, or by whom. If your organization requires audit trails for AI-assisted work — compliance, security review, team accountability — MCP’s structured tool invocation logging is what you need. Skills won’t satisfy this requirement, and you shouldn’t try to bolt logging onto them.
You’re building tooling for multiple AI clients. If the same workflow needs to run in Cursor, Claude Desktop, and Claude Code, encode it as an MCP server or prompt, not a Skill. Skills are Claude Code-native. MCP is the cross-platform layer. For teams building developer tooling rather than using it, MCP’s portability is the entire point.
Your workflows are genuinely undiscovered. Skills work best for known procedures you can articulate. If you’re in early exploration mode — unsure what your workflow even is — the 10,000+ MCP server catalog is actually valuable for tool discovery. You can try tools without writing Skills for them first. Use MCP to discover, then encode what you find into Skills once patterns emerge.
You’re replacing CI/CD. Neither Skills nor MCP is a deployment pipeline. /ship with disable-model-invocation: true can orchestrate a release checklist, but it is not GitHub Actions. It has no retry logic, no structured failure handling, and no audit history. Use proper CI/CD for production deployments; use Skills to prepare for and coordinate them.
The Honest Architecture
Simon Willison’s observation that Skills are “maybe a bigger deal than MCP” points at something real: the productivity gains in Claude Code setups come from workflow discipline, not protocol complexity. Skills encode that discipline in a format Claude can actually use — readable, lazily loaded, semantically indexed.
But MCP’s value isn’t protocol elegance. It’s that someone else already built the GitHub integration, the Postgres connector, the Sentry bridge. You don’t write glue code. You install a server, configure credentials, and Claude can query your production database. That’s genuinely hard to replicate with Skills bash scripts without rebuilding what MCP servers already provide.
The pattern that works in practice: Skills as your workflow layer, MCP as your data gateway. CLAUDE.md stays thin — project identity, hard constraints, a routing table pointing at Skills. Skills hold the procedural knowledge: how to plan a feature, how to review code, how to run QA, how to ship. MCP handles the external calls: read this GitHub PR, query this database, fetch this Sentry error.
What neither layer does: replace the domain expertise that makes a workflow worth encoding in the first place. gstack is compelling not because it’s prompts in text files but because Garry Tan’s opinions about software engineering are worth encoding. Your Skills are only as good as the workflows they capture. That part is still on you.
The MCP-maximalist position — install everything, route everything through the protocol — creates setups where Claude spends a third of its context budget on tool definitions it won’t use this session. The Skills-first position with surgical MCP usage spends that budget on the actual work. For solo devs and small teams grinding features, that trade is obvious. Make it deliberately rather than by default.