The Closed-Loop Dev Setup: Claude Code, MCPs, and Zero Context Rot
I run Claude Code and Open Code with 4 MCPs, custom skills, git worktrees, and spec-driven design. Here’s the setup.
The Shift
This isn’t about AI replacing developers. It’s about spending your time differently. I used to spend most of my day typing. Now I spend most of it thinking about problems and reviewing solutions. The typing is someone else’s job now.
Your role shifts from typist to architect. You’re still the engineer. You just stopped doing the mechanical part.
Voice + Keyboard
I wrote my own dictation software (Handsfree) because nothing else handled technical speech well. Now I mix both constantly: voice for intent ("add a POST endpoint that validates email before inserting"), keyboard for precision (rename that variable, fix that line).
Sounds weird. Works great. Voice keeps you in thinking mode. Keyboard for surgery.
Two Agents, One Workflow
I switch between Claude Code (Anthropic’s CLI) and Open Code (open-source). Claude Code feels like a Swiss army knife. Open Code feels like a senior dev. Which one I pick depends on the task, the framework, and honestly, the vibe. Same with models and plugins – it’s subjective. Benchmarks lie. Try things.
Both connect to the same extensions:
- Plugins – workflow automation and custom commands
- MCPs (Model Context Protocol) – deep tool integrations

Plugins
GSD (Get Shit Done)
Task management and step-by-step workflow coordination. Keeps agents focused.
Superpowers
Forces a proper design phase before any code gets written. Brainstorming, spec refinement, then subagent-driven implementation. This is where the spec-driven discipline comes from. The agents follow the methodology automatically.
python-infra-audit-cc
I built this one. Type /infra:audit and it scores your Python project (0-10) across 11 areas: ruff, pyright, pre-commit, CI/CD, pyproject.toml, uv, Docker, Makefile, Alembic, env config, and dead code (vulture). /infra:fix auto-remediates findings. Ships with a Renovate blueprint for dependency freshness.
| Area | What it checks |
|---|---|
| Ruff | Linting rules, security (S), import sorting |
| Pyright | Type checking, Python version match |
| Pre-commit | Hook presence, ruff + format hooks |
| CI/CD | Lint/test/format jobs, triggers |
| Docker | SHA256-pinned images, frozen installs |
| Dead code | Unused functions/imports via vulture |
Plus checks for pyproject.toml, uv, the Makefile, Alembic, and env config.
The point: if you have a repetitive engineering task, encode it as a skill. The agent understands the why, not just the pattern.
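As a rough sketch of what "encoding a task as a skill" looks like: in Claude Code, a skill is a directory containing a `SKILL.md` with YAML frontmatter. The structure below follows that format, but the audit-flavored content is illustrative, not the actual python-infra-audit-cc source:

```markdown
---
name: infra-audit
description: Audit a Python project's tooling (ruff, pyright, pre-commit,
  CI/CD, Docker) and report a 0-10 score per area.
---

When invoked, inspect pyproject.toml, .pre-commit-config.yaml, and the CI
config. For each area, state *why* the rule exists before scoring it, so
remediation follows the reasoning rather than just the pattern.
```

The frontmatter tells the agent when the skill applies; the body carries the methodology it should follow.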
MCPs: Closing the Loop
MCPs are local servers that give agents real capabilities. This is how you go from "write code" to "verify it works."
Playwright
Claude writes a component, Playwright renders it in Chromium, Claude sees the result, iterates. No alt-tabbing. "Make that button wider" goes from 2 minutes of context switching to 10 seconds.
Context7
Pulls current library docs instead of training-data-vintage guesses. Not magic – you have to tell it which library to check – but when you do, you stop getting deprecated APIs suggested with confidence.
Postgres + SQLite
Claude queries your dev database directly. Schema inspection, data shape checks, transform verification. Everything is git-committed, so the database is disposable. "Pull the last 100 orders, show the payment distribution, then normalize the field" – done without writing SQL manually.
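For reference, wiring like this typically lives in a project-level `.mcp.json` for Claude Code. The server package names and the connection string below are illustrative, not my exact config – check each server's own docs:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    },
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres",
               "postgresql://localhost:5432/devdb"]
    }
  }
}
```

Keeping this file in the repo means every worktree (and every agent session) picks up the same capabilities.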
Bubblewrap: Skip the Yes/Yes/Yes
Permission prompts are responsible design. They’re also death by a thousand clicks when the task is well-specced.
I run agents inside bubblewrap scripts with --dangerously-skip-permissions. Filesystem is restricted to the dev directory. Docker access exists (theoretical escape via volume mounts, but you’d need very targeted prompt injection and I’d catch it in review). Git push works. The attack surface is narrow.
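A hedged sketch of such a wrapper: read-only system directories, one writable worktree, fresh namespaces, network kept so the agent can reach the model API. The paths and the `claude` binary name are assumptions for illustration, not my exact script:

```shell
#!/usr/bin/env sh
# Sandbox an agent CLI inside bubblewrap, confined to one dev directory.
# PROJECT_DIR and the "claude" invocation below are placeholders.
PROJECT_DIR="${PROJECT_DIR:-$HOME/project/feature-a}"

sandboxed() {
  bwrap \
    --ro-bind /usr /usr \
    --ro-bind /etc /etc \
    --symlink usr/bin /bin \
    --symlink usr/lib /lib \
    --dev /dev \
    --proc /proc \
    --bind "$PROJECT_DIR" "$PROJECT_DIR" \
    --chdir "$PROJECT_DIR" \
    --unshare-all \
    --share-net \
    -- "$@"
}

# Example (agent runs unattended inside the sandbox):
# sandboxed claude --dangerously-skip-permissions
```

The point of `--unshare-all` plus a single writable bind is that even with permissions skipped, a misbehaving agent can only touch the one worktree it was given.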
The workflow flips: spec thoroughly upfront, let the agent run autonomously, review the outcome. The quality gate moves from "approve each step" to "approve the result." I kick off three features in parallel worktrees, make coffee, come back to review.
Git Worktrees
Separate directories per branch instead of stash/unstash gymnastics:
```
~/project/main        # main branch
~/project/feature-a   # feature A
~/project/feature-b   # feature B
```
Each worktree gets its own agent session. Parallel development, no context loss, isolated test state.
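Setting up that layout takes two commands per branch. The demo below is self-contained in a throwaway repo; in real use you would run the `git worktree add` lines from your existing checkout:

```shell
# Self-contained demo: one directory per branch, one shared .git store.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q main-repo
cd main-repo
git -c user.email=a@b.c -c user.name=demo commit -q --allow-empty -m init

# Create a sibling directory per feature branch:
git worktree add -q -b feature-a ../feature-a
git worktree add -q -b feature-b ../feature-b
git worktree list   # main-repo + feature-a + feature-b
```

Unlike clones, worktrees share objects and refs, so branches stay cheap and a fetch in one directory is visible in all of them.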
The Cycle
- Spec the feature in markdown (or write a failing test for small stuff)
- Agent runs bubblewrapped in a worktree
- MCPs verify – Playwright for UI, Postgres for data, Context7 for docs
- Review the diff, run tests, apply judgment
- Merge
Spec first, review last. Describe outcomes, not implementation details. The agent figures out the how.
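In that outcome-first style, a spec might look like the following (a hypothetical feature, not one from my backlog):

```markdown
## Feature: password reset

Outcome: a user who forgot their password can regain access via email.

- Request with a known email → reset link sent, link expires in 1 hour
- Unknown email → identical response (no account enumeration)
- Done when: Playwright can complete the full flow in Chromium
```

Note what is absent: no endpoint names, no table schema, no library choices. Those are the agent's problem; the spec only pins down what "done" means.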
Cost
Claude Max tops out at ~200 EUR/month. But Open Code with Kimi K2.5 or GLM5 via OpenRouter gets you started for 20-30 EUR. With good specs, cheaper models deliver surprisingly well. I switch between tools and models based on preference, not price.
What Went Wrong (And What I Built Because of It)
"Everything works!" (It didn't.) Claude reports success, unit tests pass, but the actual user flow is broken. Hence: Playwright. "Works" means a real browser can complete the journey.
Outdated everything. Deprecated APIs, stale patterns, ancient versions. Claude codes against its training data. Hence: Context7 + Renovate + infra audit.
The „it compiles“ trap. A transform runs clean but silently drops 30% of records. Hence: Postgres/SQLite MCPs. Verify the output, not the syntax.
The harness is scar tissue. Every rule in python-infra-audit-cc exists because I shipped broken code without it.
Setup takes about a day if you know what you’re doing. Getting here took a year, starting with Cursor, evolving piece by piece. The stack keeps changing. For novel algorithm work, I still go manual. But the guardrails enforce themselves now.
Bottom Line
The bottleneck was never typing speed. It was always thinking, designing, verifying, and reviewing. AI accelerates the iteration cycle for all four. I ship better code, faster, with higher confidence. Not because I type faster – because I think more.
Gernot Greimler is a data engineer and the author of python-infra-audit-cc. He builds data pipelines, infrastructure automation, and AI-augmented dev workflows at dataprospectors.at.
Questions about the stack? Reach out on LinkedIn.