mirror of https://github.com/garrytan/gstack.git synced 2026-05-02 03:35:09 +02:00

Files

T

Garry Tan 9d47619e4c feat: Completeness Principle — Boil the Lake (v0.6.1) (#140 )

* feat: Completeness Principle — Boil the Lake (WIP, pre-merge)

Add Completeness Principle to all skill preambles, dual-time estimates,
compression table, anti-pattern gallery, Lake Score, and completeness
gaps review category. VERSION/CHANGELOG will be rebased after merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update stale version reference in TODOS.md (v0.5.3 → v0.6.1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CHANGELOG date + README for v0.6.1 features

- Add date to CHANGELOG 0.6.1 entry
- Add Completeness Principle to README intro
- Add SELECTIVE EXPANSION mode to CEO review section
- Add test bootstrap mention to /ship section
- Fix uninstall command missing design-consultation in project uninstall
- Add "recommends shortcuts" and "no tests" to Without gstack list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: split README into lean intro + docs/ directory (gh CLI pattern)

README: 875 → 243 lines. Keeps intro, skill table, demo, install, and
troubleshooting. All per-skill deep dives, Greptile integration guide,
and contributor mode docs moved to docs/ directory.

- docs/skills.md — full philosophy and examples for all 13 skills
- docs/greptile.md — Greptile setup and triage workflow
- docs/contributor-mode.md — how to enable and use contributor mode
- README now links to docs/ via Documentation table
- Updated skill table entries with latest features (fix-first, regression
  tests, test health, completeness gaps)
- Updated demo transcript with AUTO-FIXED, coverage audit, regression test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove "competitor" language, rewrite README in Garry's voice

Replace "browses competitors" with "knows the landscape" / "what's out
there" throughout all user-facing copy. Trim README from 243 to 167
lines — tighter, more opinionated, less listicle energy. Remove
Completeness Principle from README top (it lives in CLAUDE.md and the
skill preambles where Claude actually reads it).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: rewrite README in Garry's raw voice — AGI era, L8 factory, real stories

The README now sounds like Garry, not a product page. Leads with the
live experiment, the 16k LOC/day reality, the real-life coding stories
(Austin, hospital bedside). Highlights the newest unlocks (design at
the heart, /qa parallelism, smart review routing, test bootstrap).
Closes with an open invitation — free MIT, fork it, let's all ride
the wave together.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Garry's bonafides to README intro — Palantir, Posterous, YC, 600k LOC

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add real /retro numbers — 140k lines, 362 commits across 3 projects

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add "in the last 60 days" timeframe to 600k LOC claim

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add GitHub contribution graphs — 2026 vs 2013 side by side

Same person, different era. 2013: 772 contributions building Bookface.
2026: 1,237 contributions and accelerating. The difference is the tooling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify /retro stats are from last 7 days

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add designer/PM/eng manager roles to intro

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove Josh/L8 reference from README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: move demo up, make it dramatically more impressive

Show the actual architecture diagram, auto-fixed issues, 100% coverage,
regression test generation. Punch line: "That is not a copilot. That is
a team."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove "My journey" section — intro already covers it

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: prefix all skill commands with You: in demo transcript

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: collapse You/Claude lines in demo — no gap between command and response

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify plan mode flow in demo — approve, exit, Claude implements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: move /ship to end of demo — review → QA → ship is the real flow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add /plan-design-review to demo, tighten CEO response

Shorter CEO reply, compressed eng diagram, added design audit with
AI Slop score. Seven commands now: plan → eng → build → design →
review → QA → ship.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: move design review before implementation — it's part of planning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: reorder demo — design before eng, after CEO

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove URL from /plan-design-review in demo

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add [...] annotations showing what actually happens at each step

Each step now shows what the agent does under the hood: 8 expansion
proposals cherry-picked, 80-item design audit, ASCII diagrams for
every flow, 2400 lines written in 8 minutes, real browser QA, bug
found and fixed. Makes the demo feel real, not abstract.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: rename Contributor Mode to How to Contribute in docs table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Coinbase, Instacart, Rippling to YC bonafides

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add "one or two people in a garage" to founder story

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add skill table to top of skills.md with anchor links

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: consolidate — roll contributor-mode into CONTRIBUTING, greptile into skills

- docs/contributor-mode.md → merged into CONTRIBUTING.md (session awareness section)
- docs/greptile.md → merged into docs/skills.md (Greptile integration section)
- Reordered docs table: Skills > Architecture > Browser > Contributing > Changelog

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-17 16:34:08 -05:00

13 KiB

Raw Permalink Blame History

gstack

Hi, I'm Garry Tan. I'm President & CEO of Y Combinator, where I've worked with thousands of startups including Coinbase, Instacart, and Rippling when the founders were just one or two people in a garage — companies now worth tens of billions of dollars. Before YC, I designed the Palantir logo and was one of the first eng manager/PM/designers there. I cofounded Posterous, a blog platform we sold to Twitter. I built Bookface, YC's internal social network, back in 2013. I've been building products as a designer, PM, and eng manager for a long time.

And right now I am in the middle of something that feels like a new era entirely.

In the last 60 days I have written over 600,000 lines of production code — 35% tests — and I am doing 10,000 to 20,000 usable lines of code per day as a part-time part of my day while doing all my duties as CEO of YC. That is not a typo. My last /retro (developer stats from the last 7 days) across 3 projects: 140,751 lines added, 362 commits, ~115k net LOC. The models are getting dramatically better every week. We are at the dawn of something real — one person shipping at a scale that used to require a team of twenty.

2026 — 1,237 contributions and counting:

2013 — when I built Bookface at YC (772 contributions):

Same person. Different era. The difference is the tooling.

gstack is how I do it. It is my open source software factory. It turns Claude Code into a virtual engineering team you actually manage — a CEO who rethinks the product, an eng manager who locks the architecture, a designer who catches AI slop, a paranoid reviewer who finds production bugs, a QA lead who opens a real browser and clicks through your app, and a release engineer who ships the PR. Thirteen specialists, all as slash commands, all Markdown, all free, MIT license, available right now.

I am learning how to get to the edge of what agentic systems can do as of March 2026, and this is my live experiment. I am sharing it because I want the whole world on this journey with me.

Fork it. Improve it. Make it yours. Don't player hate, appreciate.

See it work

You:    I want to add photo upload for sellers.
You:    /plan-ceo-review
Claude: "Photo upload" is not the feature. The real job is helping
        sellers create listings that actually sell. What if we
        auto-identify the product, pull specs and comps from the
        web, and draft the listing automatically? That's 10 stars.
        "Upload a photo" is 3 stars. Which are we building?
        [8 expansion proposals, you cherry-pick 5, defer 3 to backlog]

You:    /plan-design-review
Claude: Design Score: B  |  AI Slop Score: C
        "Upload flow looks like a default Bootstrap form."
        [80-item audit, infers your design system, exports DESIGN.md]
        [flags 3 AI slop patterns: gradient hero, icon grid, uniform radius]

You:    /plan-eng-review
Claude: ┌─────────┐    ┌──────────┐    ┌──────────┐    ┌─────────┐
        │ Upload  │───▶│ Classify │───▶│ Enrich   │───▶│ Draft   │
        │ (sync)  │    │ (async)  │    │ (async)  │    │ (async) │
        └─────────┘    └──────────┘    └──────────┘    └─────────┘
        [ASCII diagrams for every data flow, state machine, error path]
        [14-case test matrix, 6 failure modes mapped, 3 security concerns]

You:    Approve plan. Exit plan mode.
        [Claude writes 2,400 lines across 11 files — models, services,
         controllers, views, migrations, and tests. ~8 minutes.]

You:    /review
Claude: [AUTO-FIXED] Orphan S3 cleanup on failed upload
        [AUTO-FIXED] Missing index on listings.status
        [ASK] Race condition on hero image selection → You: yes
        [traces every new enum value through all switch statements]
        3 issues — 2 auto-fixed, 1 fixed.

You:    /qa https://staging.myapp.com
Claude: [opens real browser, logs in, uploads photos, clicks through flows]
        Upload → classify → enrich → draft: end to end ✓
        Mobile: ✓  |  Slow connection: ✓  |  Bad image: ✓
        [finds bug: preview doesn't clear on second upload — fixes it]
        Regression test generated.

You:    /ship
Claude: Tests: 42 → 51 (+9 new)
        Coverage: 14/14 code paths (100%)
        PR: github.com/you/app/pull/42

One feature. Seven commands. The agent reframed the product, ran an 80-item design audit, drew the architecture, wrote 2,400 lines of code, found a race condition I would have missed, auto-fixed two issues, opened a real browser to QA test, found and fixed a bug I didn't know about, wrote 9 tests, and generated a regression test. That is not a copilot. That is a team.

The team

Skill	Your specialist	What they do
`/plan-ceo-review`	CEO / Founder	Rethink the problem. Find the 10-star product hiding inside the request. Four modes: Expansion, Selective Expansion, Hold Scope, Reduction.
`/plan-eng-review`	Eng Manager	Lock in architecture, data flow, diagrams, edge cases, and tests. Forces hidden assumptions into the open.
`/plan-design-review`	Senior Designer	80-item design audit with letter grades. AI Slop detection. Infers your design system. Report only — never touches code.
`/design-consultation`	Design Partner	Build a complete design system from scratch. Knows the landscape, proposes creative risks, generates realistic product mockups. Design at the heart of all other phases.
`/review`	Staff Engineer	Find the bugs that pass CI but blow up in production. Auto-fixes the obvious ones. Flags completeness gaps.
`/ship`	Release Engineer	Sync main, run tests, audit coverage, push, open PR. Bootstraps test frameworks if you don't have one. One command.
`/browse`	QA Engineer	Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command.
`/qa`	QA Lead	Test your app, find bugs, fix them with atomic commits, re-verify. Auto-generates regression tests for every fix.
`/qa-only`	QA Reporter	Same methodology as /qa but report only. Use when you want a pure bug report without code changes.
`/qa-design-review`	Designer Who Codes	Same audit as /plan-design-review, then fixes what it finds. Atomic commits, before/after screenshots.
`/setup-browser-cookies`	Session Manager	Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages.
`/retro`	Eng Manager	Team-aware weekly retro. Per-person breakdowns, shipping streaks, test health trends, growth opportunities.
`/document-release`	Technical Writer	Update all project docs to match what you just shipped. Catches stale READMEs automatically.

Deep dives with examples and philosophy for every skill →

What's new and why it matters

Design is at the heart. /design-consultation doesn't just pick fonts. It researches what's out there in your space, proposes safe choices AND creative risks, generates realistic mockups of your actual product, and writes DESIGN.md — and then /qa-design-review and /plan-eng-review read what you chose. Design decisions flow through the whole system.

/qa was a massive unlock. It let me go from 6 to 12 parallel workers. Claude Code saying "I SEE THE ISSUE" and then actually fixing it, generating a regression test, and verifying the fix — that changed how I work. The agent has eyes now.

Smart review routing. Just like at a well-run startup: CEO doesn't have to look at infra bug fixes, design review isn't needed for backend changes. gstack tracks what reviews are run, figures out what's appropriate, and just does the smart thing. The Review Readiness Dashboard tells you where you stand before you ship.

Test everything. /ship bootstraps test frameworks from scratch if your project doesn't have one. Every /ship run produces a coverage audit. Every /qa bug fix generates a regression test. 100% test coverage is the goal — tests make vibe coding safe instead of yolo coding.

/document-release is the engineer you never had. It reads every doc file in your project, cross-references the diff, and updates everything that drifted. README, ARCHITECTURE, CONTRIBUTING, CLAUDE.md, TODOS — all kept current automatically.

10 sessions at once

gstack is powerful with one session. It is transformative with ten.

Conductor runs multiple Claude Code sessions in parallel — each in its own isolated workspace. One session running /qa on staging, another doing /review on a PR, a third implementing a feature, and seven more on other branches. All at the same time.

One person, ten parallel agents, each with the right cognitive mode. That is a different way of building software.

Install — takes 30 seconds

Requirements: Claude Code, Git, Bun v1.0+

Step 1: Install on your machine

Open Claude Code and paste this. Claude does the rest.

Install gstack: run git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp__claude-in-chrome__* tools, and lists the available skills: /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /qa-design-review, /setup-browser-cookies, /retro, /document-release. Then ask the user if they also want to add gstack to the current project so teammates get it.

Step 2: Add to your repo so teammates get it (optional)

Add gstack to this project: run cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp__claude-in-chrome__* tools, lists the available skills: /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /review, /ship, /browse, /qa, /qa-only, /qa-design-review, /setup-browser-cookies, /retro, /document-release, and tells Claude that if gstack skills aren't working, run cd .claude/skills/gstack && ./setup to build the binary and register skills.

Real files get committed to your repo (not a submodule), so git clone just works. Everything lives inside .claude/. Nothing touches your PATH or runs in the background.

+----------------------------------------------------------------------------+
|                                                                            |
|   Are you a great software engineer who wants to ship 10K+ LOC/day?       |
|                                                                            |
|   Come work at YC: ycombinator.com/software                                |
|                                                                            |
|   Extremely competitive salary and equity.                                 |
|   Now hiring in San Francisco, Dogpatch District.                          |
|   Come join the revolution.                                                |
|                                                                            |
+----------------------------------------------------------------------------+

Come ride the wave

This is free, MIT licensed, open source, available now. No premium tier. No waitlist. No strings.

I open sourced how I do development and I am actively upgrading my own software factory here. You can fork it and make it your own. That's the whole point. I want everyone on this journey.

The models are getting better fast. The people who figure out how to work with them now — really work with them, not just dabble — are going to have a massive advantage. This is that window. Let's go.

github.com/garrytan/gstack — MIT License

Docs

Doc	What it covers
Skill Deep Dives	Philosophy, examples, and workflow for every skill (includes Greptile integration)
Architecture	Design decisions and system internals
Browser Reference	Full command reference for `/browse`
Contributing	Dev setup, testing, contributor mode, and dev mode
Changelog	What's new in every version

Troubleshooting

Skill not showing up? cd ~/.claude/skills/gstack && ./setup

/browse fails? cd ~/.claude/skills/gstack && bun install && bun run build

Stale install? Run /gstack-upgrade — or set auto_upgrade: true in ~/.gstack/config.yaml

License

MIT. Free forever. Go build something.

13 KiB Raw Permalink Blame History