The Stack
Everything running this site and the vibe coding setup behind it. Three AI models, a voice extractor GPT, five MCP servers, and a lot of git pushes.
Stack Version: v11 | Last Updated: 2025-12-16
Okay so this site is an experiment in AI-assisted development. The stack is part of the experiment. If I were reading this, I’d want to know what’s actually running it.
Here’s everything. And I mean everything.
Claude Code + VS Code
This is the core. Everything else is secondary.
I started with Claude Code in a terminal window. A pure terminal experience: text in, text out, code appears. But I couldn’t see what was happening. Files would change and I wouldn’t know until I committed. Slash commands existed but I wasn’t skilled enough to use them.
Then I tried VS Code.
Same Claude Code, but now I can see the work. Watch files update in real time. Review changes before committing. The Explorer panel shows what Claude modified. The Source Control panel shows what’s staged. I went from “hope this is right” to “I can see exactly what happened.”
Learning happened faster. When you can see the patterns Claude uses (how it structures files, names functions, handles errors), you absorb them. Terminal-only Claude Code is powerful. VS Code with Claude Code is powerful and educational.
I still prefer the terminal for the actual interaction. VS Code gives me both: terminal feel, visual feedback.
Voice Input: Wispr Flow
Wispr Flow is how I talk to Claude.
Not metaphorically. Literally. I speak, Wispr transcribes, Claude receives text. The latency is negligible. The accuracy is better than my typing. And the speed difference is dramatic. I can explain what I want faster than I can type it.
This matters more than it sounds. When you can describe problems at the speed of thought, you iterate faster. You explore more options. You don’t self-edit while typing because typing is slow. Voice removes the bottleneck between thinking and prompting.
The integration works everywhere. Wispr runs in the background, I hit a hotkey, I talk, text appears wherever my cursor is. Terminal, VS Code, browser, everywhere. No copy-paste, no switching apps.
I can’t imagine going back to typing prompts.
Three AI Models (Plus One GPT)
Not redundant. Complementary.
Primary Stack
- Claude Opus 4.5: design, implement, reason (via Claude Code)
- Claude Haiku 3.5: production API, cost-optimized
Second Opinions
- Gemini 3 Pro: code review, debugging, alternative approaches
- GPT-5.2: deep analysis, multi-step reasoning (via PAL MCP)
The multi-model setup is the interesting part. When I’m stuck on a bug or want a sanity check, I can ask a different model. Claude has blind spots. Gemini has different blind spots. GPT-5.2 has different ones still. Three reviewers catch more than one.
This isn’t about which model is “better.” It’s about creating a review loop. Claude writes, Gemini or GPT reviews, Claude iterates. The PAL MCP adds something new: multi-model consensus. I can ask all three to analyze the same code and see where they agree or disagree.
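If you want the shape of it, the consensus step is just a fan-out and compare. Here’s a minimal TypeScript sketch, under the assumption that each reviewer is some async function that answers a prompt with text. This is not PAL’s actual interface, just the pattern:

```ts
// Conceptual sketch only. This is not PAL's real interface; a "Reviewer"
// here is just any async function that answers a prompt with text.
type Reviewer = { name: string; ask: (prompt: string) => Promise<string> };

async function consensusReview(code: string, reviewers: Reviewer[]) {
  const prompt = `Review this code for bugs and questionable design:\n\n${code}`;

  // Fan the same prompt out to every model in parallel.
  const answers = await Promise.all(
    reviewers.map(async (r) => ({ name: r.name, review: await r.ask(prompt) }))
  );

  // Side-by-side answers; spotting agreement and disagreement stays manual.
  return answers;
}
```

The point isn’t the code, it’s the pattern: same prompt, three models, read the disagreements.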
Style Extractor GPT
Here’s a weird one. I use a custom GPT called Style Extractor to capture writing voice.
You feed it transcripts or writing samples. It spits out a voice profile: the specific patterns, rhythms, word choices, quirks that make someone sound like themselves. I used it on my own rambling Zoom calls and got back a profile that now lives in my blog voice guide.
So now when I write with Claude, I can paste in that profile and say “write like this.” The output sounds more like me talking than like AI-generated content. The Slop Detector catches when the voice drifts. Style Extractor helps me define what “my voice” even means.
It’s a weird loop: GPT extracts my voice, Claude writes in my voice, my own tool checks if it sounds like AI. Kind of meta.
BMAD: Beyond Project Planning
BMAD is a workflow framework. Tech specs, architecture decisions, user stories. I use it when projects have multiple moving parts.
But the useful parts extend beyond project work.
Party Mode puts multiple AI personas in conversation. I’ve used it to brainstorm product ideas, debug business problems, stress-test strategies. Five different perspectives arguing with each other catch blind spots a single model misses.
The Business Analyst persona is good for thinking through problems that aren’t code. Market positioning. Pricing strategy. User research framing. Having a structured way to think through business questions, with specific prompts and formats, makes the AI output dramatically better than freeform “help me think about X.”
The personas are the interesting part. I learned most of what I know about personas from Stunspot, whose work on persona prompting and skillgraphs shaped how I think about giving LLMs structure. When you give an LLM a specific role with defined expertise, constraints, and communication style, the output quality jumps. It’s not just prompting. It’s context engineering. There’s a whole post coming on why personas work and how to build good ones.
The V2 planning for the Slop Detector used BMAD heavily. But I’ve also used it for client work, side projects, and general “I need to think clearly about something complicated.”
MCP: The Tool Network
MCP is Model Context Protocol. It lets Claude Code talk to external services without leaving the terminal.
I run everything through a gateway called OneMCP. Without it, each service dumps 50+ tool definitions into context. That’s a problem: every tool definition eats into your context window. Load five MCP servers directly, and you’ve burned thousands of tokens before the conversation starts. OneMCP acts as a proxy. Claude searches for what it needs, only relevant tools load. The context stays clean.
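The mechanics are easier to see in code. A toy sketch of the proxy idea, with invented tool names; this is not OneMCP’s implementation, just the token math it’s solving:

```ts
// Toy illustration of the gateway/proxy idea, not OneMCP's implementation.
// The tool names below are invented examples.
type ToolDef = { server: string; name: string; description: string };

const registry: ToolDef[] = [
  { server: "memory", name: "create_entities", description: "Add nodes to the knowledge graph" },
  { server: "brightdata", name: "scrape_url", description: "Fetch and parse a web page" },
  // ...dozens more that would normally all sit in the context window
];

// Instead of loading every definition up front, the gateway exposes one
// search step; only the matching definitions get injected into context.
function searchTools(query: string): ToolDef[] {
  const q = query.toLowerCase();
  return registry.filter(
    (t) => t.name.includes(q) || t.description.toLowerCase().includes(q)
  );
}

console.log(searchTools("scrape")); // only the scraping tool enters context
```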
Active Servers (via OneMCP)
- Memory: persistent knowledge graph across sessions
- Gemini: second-opinion AI integration
- PAL: Provider Abstraction Layer for GPT-5.2, multi-model consensus, code analysis
- Bright Data: web scraping and research enrichment
Custom APIs (not MCP)
- Google Drive: full CRUD, built custom for specific auth and data needs
- HubSpot: full CRUD, built custom for client work
Most of these aren’t used on this site. But they’re part of the broader setup. When I’m doing client work, I can pull CRM data, update documents, and write code all from the same terminal session.
The Memory server maintains context across sessions. When I start a new conversation, relevant context from previous work is already there. There’s a whole post coming on memory, context windows, and how to make Claude remember things it shouldn’t be able to remember.
Slop Detector
The Slop Detector is the tool I built on this site. But it’s also part of my workflow now.
Before publishing any blog post, I run the draft through it. Two axes: Origin (is it AI?) and Quality (is it good?). The goal is “Polished AI”: AI patterns present, but well-executed. If Origin scores too high with Quality too low, that’s classic slop. Rewrite.
The “Going Too Fast” post scored Quality 100, Origin 93. That’s the target. You’re reading AI-assisted writing that doesn’t read like AI-assisted writing.
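If it helps, the verdict logic boils down to a tiny decision rule. A sketch with made-up numbers; the thresholds and the “reads human” label are placeholders, not the detector’s real cutoffs or categories:

```ts
// Illustrative only: the thresholds and the "reads human" label are
// placeholders, not the detector's real cutoffs or categories.
type Verdict = "polished AI" | "classic slop" | "reads human";

function classify(origin: number, quality: number): Verdict {
  const AI_LIKE = 70;     // placeholder: above this, Origin says AI patterns are present
  const GOOD_ENOUGH = 80; // placeholder: above this, Quality says it is well-executed

  if (origin < AI_LIKE) return "reads human";
  return quality >= GOOD_ENOUGH ? "polished AI" : "classic slop";
}

classify(93, 100); // the "Going Too Fast" post lands on "polished AI"
```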
The Website
Core Framework
- Astro 5: ships zero JavaScript by default
- React 19: only for interactive components
- Tailwind CSS 4: utility classes, no CSS thinking
- MDX: markdown with components
- TypeScript (strict)
Infrastructure
- Vercel: git push deploys, serverless functions
- Supabase: PostgreSQL without the ops overhead
- Resend: transactional email
- Claude API (Haiku 3.5): detection endpoint (sketched below)
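The detection endpoint is the one place this site calls the Claude API directly. Here’s a stripped-down sketch of what that route could look like as an Astro API endpoint on Vercel; the real one does more than this (budget checks, a fuller prompt, structured parsing of the two scores):

```ts
// Stripped-down sketch of a detection route; the real endpoint does more
// (budget checks, a fuller prompt, structured parsing of the two scores).
import type { APIRoute } from "astro";
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({ apiKey: import.meta.env.ANTHROPIC_API_KEY });

export const POST: APIRoute = async ({ request }) => {
  const { draft } = await request.json();

  const message = await anthropic.messages.create({
    model: "claude-3-5-haiku-latest", // exact model string may differ
    max_tokens: 512,
    messages: [
      {
        role: "user",
        content: `Score this text on two axes, Origin (0-100, how AI it reads) and Quality (0-100, how well-executed it is). Reply as JSON.\n\n${draft}`,
      },
    ],
  });

  // Haiku replies with content blocks; take the first text block as the body.
  const first = message.content[0];
  const body = first.type === "text" ? first.text : "{}";
  return new Response(body, { headers: { "Content-Type": "application/json" } });
};
```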
Astro was the right call. The Slop Detector gets React. The blog posts get static HTML. Lighthouse scores: 100 across the board.
Supabase over Vercel KV was a deliberate choice. When I needed persistent state for budget limits, Claude suggested adding Vercel KV. I pushed back: we already have Supabase, why add another dependency? The extra latency is fine. No new vendors to manage.
The real reason for Supabase is extensibility. PostgreSQL means I can do anything a real database does. The community is massive: when I hit edge cases, someone has already solved them. And when the site grows, I’m not locked into a toy database that can’t scale.
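The budget limits are a good example of the boring choice paying off. A rough sketch of a daily-spend check; the table and column names are made up for illustration, not the real schema:

```ts
// Rough sketch of the budget check; "api_usage" and its columns are
// invented names for illustration, not the real schema.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

const DAILY_LIMIT_USD = 5; // placeholder cap, not the real number

export async function underBudget(): Promise<boolean> {
  const today = new Date().toISOString().slice(0, 10);

  // Sum today's recorded spend before allowing another detection call.
  const { data, error } = await supabase
    .from("api_usage")
    .select("cost_usd")
    .eq("day", today);

  if (error) throw error;
  const spent = (data ?? []).reduce((sum, row) => sum + row.cost_usd, 0);
  return spent < DAILY_LIMIT_USD;
}
```

Same database, one more table. That’s the extensibility argument in practice.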
Deployment
Active
- Vercel: frontend, serverless functions, everything right now
Right now, Vercel handles everything. The site is server-rendered, API endpoints are serverless functions, deploys happen on git push.
In Toolkit (Not Deployed Here)
Railway deserves a mention even though it’s not running this site. I’ve used it for other projects: connecting databases, running background workers, deploying services that need to stay alive. For someone who doesn’t know backend infrastructure (me), Railway makes the hard parts trivial. Click a few buttons, connect your repo, it works. When this site needs persistent processes, Railway is the obvious choice.
n8n is for workflow automation: scheduled jobs, webhooks, data pipelines. Not needed yet, but ready when I need it.
What This Means
The stack is deliberately heavy on AI tooling and light on traditional infrastructure.
I don’t have a local development server running most of the time. I don’t have a staging environment. I don’t have CI/CD pipelines or test suites. The feedback loop is: write code with Claude, push to Vercel, see if it works in production.
The thesis is that AI assistance makes traditional infrastructure less necessary. You move faster when the AI catches errors before they ship. You can skip staging when deploys are instant and rollbacks are one click.
The counter-thesis is that I’m building technical debt and will regret it later.
Both might be true.
This post was written in Claude Code. The stack description is accurate as of today. Tomorrow it might be different.
v11: Added Style Extractor GPT, rewrote in new voice, killed 6 em dashes. All iterations preserved in docs/drafts/archive/.