merge: resolve conflicts with origin/main (v0.9.0.1 → v0.9.1)

Integrated dynamic template discovery, Codex host support, telemetry,
and plan-mode persistence from main. Generated Codex variants for
canary, benchmark, and land-and-deploy skills.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-20 06:25:02 -07:00
107 changed files with 17791 additions and 370 deletions
+72 -9
View File
@@ -34,8 +34,15 @@ echo "BRANCH: $_BRANCH"
echo "PROACTIVE: $_PROACTIVE"
_LAKE_SEEN=$([ -f ~/.gstack/.completeness-intro-seen ] && echo "yes" || echo "no")
echo "LAKE_INTRO: $_LAKE_SEEN"
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || true)
_TEL_PROMPTED=$([ -f ~/.gstack/.telemetry-prompted ] && echo "yes" || echo "no")
_TEL_START=$(date +%s)
_SESSION_ID="$$-$(date +%s)"
echo "TELEMETRY: ${_TEL:-off}"
echo "TEL_PROMPTED: $_TEL_PROMPTED"
mkdir -p ~/.gstack/analytics
echo '{"skill":"review","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
for _PF in ~/.gstack/analytics/.pending-*; do [ -f "$_PF" ] && ~/.claude/skills/gstack/bin/gstack-telemetry-log --event-type skill_run --skill _pending_finalize --outcome unknown --session-id "$_SESSION_ID" 2>/dev/null || true; break; done
```
If `PROACTIVE` is `"false"`, do not proactively suggest gstack skills — only invoke
@@ -55,6 +62,39 @@ touch ~/.gstack/.completeness-intro-seen
Only run `open` if the user says yes. Always run `touch` to mark as seen. This only happens once.
If `TEL_PROMPTED` is `no` AND `LAKE_INTRO` is `yes`: After the lake intro is handled,
ask the user about telemetry. Use AskUserQuestion:
> Help gstack get better! Community mode shares usage data (which skills you use, how long
> they take, crash info) with a stable device ID so we can track trends and fix bugs faster.
> No code, file paths, or repo names are ever sent.
> Change anytime with `gstack-config set telemetry off`.
Options:
- A) Help gstack get better! (recommended)
- B) No thanks
If A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry community`
If B: ask a follow-up AskUserQuestion:
> How about anonymous mode? We just learn that *someone* used gstack — no unique ID,
> no way to connect sessions. Just a counter that helps us know if anyone's out there.
Options:
- A) Sure, anonymous is fine
- B) No thanks, fully off
If B→A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry anonymous`
If B→B: run `~/.claude/skills/gstack/bin/gstack-config set telemetry off`
Always run:
```bash
touch ~/.gstack/.telemetry-prompted
```
This only happens once. If `TEL_PROMPTED` is `yes`, skip this entirely.
## AskUserQuestion Format
**ALWAYS follow this structure for every AskUserQuestion call:**
@@ -154,6 +194,34 @@ ATTEMPTED: [what you tried]
RECOMMENDATION: [what the user should do next]
```
## Telemetry (run last)
After the skill workflow completes (success, error, or abort), log the telemetry event.
Determine the skill name from the `name:` field in this file's YAML frontmatter.
Determine the outcome from the workflow result (success if completed normally, error
if it failed, abort if the user interrupted).
**PLAN MODE EXCEPTION — ALWAYS RUN:** This command writes telemetry to
`~/.gstack/analytics/` (user config directory, not project files). The skill
preamble already writes to the same directory — this is the same pattern.
Skipping this command loses session duration and outcome data.
Run this bash:
```bash
_TEL_END=$(date +%s)
_TEL_DUR=$(( _TEL_END - _TEL_START ))
rm -f ~/.gstack/analytics/.pending-"$_SESSION_ID" 2>/dev/null || true
~/.claude/skills/gstack/bin/gstack-telemetry-log \
--skill "SKILL_NAME" --duration "$_TEL_DUR" --outcome "OUTCOME" \
--used-browse "USED_BROWSE" --session-id "$_SESSION_ID" 2>/dev/null &
```
Replace `SKILL_NAME` with the actual skill name from frontmatter, `OUTCOME` with
success/error/abort, and `USED_BROWSE` with true/false based on whether `$B` was used.
If you cannot determine the outcome, use "unknown". This runs in the background and
never blocks the user.
## Step 0: Detect base branch
Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps.
@@ -271,7 +339,7 @@ Follow the output format specified in the checklist. Respect the suppressions
Check if the diff touches frontend files using `gstack-diff-scope`:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
```
**If `SCOPE_FRONTEND=false`:** Skip design review silently. No output.
@@ -294,12 +362,10 @@ eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
6. **Log the result** for the Review Readiness Dashboard:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
mkdir -p ~/.gstack/projects/$SLUG
echo '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"design-review-lite","timestamp":"TIMESTAMP","status":"STATUS","findings":N,"auto_fixed":M,"commit":"COMMIT"}'
```
Substitute: TIMESTAMP = ISO 8601 datetime, STATUS = "clean" if 0 findings or "issues_found", N = total findings, M = auto-fixed count.
Substitute: TIMESTAMP = ISO 8601 datetime, STATUS = "clean" if 0 findings or "issues_found", N = total findings, M = auto-fixed count, COMMIT = output of `git rev-parse --short HEAD`.
Include any design findings alongside the findings from Step 4. They follow the same Fix-First flow in Step 5 — AUTO-FIX for mechanical CSS fixes, ASK for everything else.
@@ -453,10 +519,7 @@ Present the full output verbatim under a `CODEX SAYS (adversarial challenge):` h
**Only if a code review ran (user chose A or C):** Persist the Codex review result to the review log:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
BRANCH_SLUG=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
mkdir -p ~/.gstack/projects/"$SLUG"
echo '{"skill":"codex-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","gate":"GATE"}' >> ~/.gstack/projects/"$SLUG"/"$BRANCH_SLUG"-reviews.jsonl
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","gate":"GATE"}'
```
Substitute: STATUS ("clean" if PASS, "issues_found" if FAIL), GATE ("pass" or "fail").
+1 -4
View File
@@ -267,10 +267,7 @@ Present the full output verbatim under a `CODEX SAYS (adversarial challenge):` h
**Only if a code review ran (user chose A or C):** Persist the Codex review result to the review log:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
BRANCH_SLUG=$(git rev-parse --abbrev-ref HEAD 2>/dev/null | tr '/' '-')
mkdir -p ~/.gstack/projects/"$SLUG"
echo '{"skill":"codex-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","gate":"GATE"}' >> ~/.gstack/projects/"$SLUG"/"$BRANCH_SLUG"-reviews.jsonl
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","gate":"GATE"}'
```
Substitute: STATUS ("clean" if PASS, "issues_found" if FAIL), GATE ("pass" or "fail").
+7 -7
View File
@@ -35,16 +35,16 @@ Be terse. For each issue: one line describing the problem, one line with the fix
### Pass 1 — CRITICAL
#### SQL & Data Safety
- String interpolation in SQL (even if values are `.to_i`/`.to_f` — use `sanitize_sql_array` or Arel)
- String interpolation in SQL (even if values are `.to_i`/`.to_f` — use parameterized queries (Rails: sanitize_sql_array/Arel; Node: prepared statements; Python: parameterized queries))
- TOCTOU races: check-then-set patterns that should be atomic `WHERE` + `update_all`
- `update_column`/`update_columns` bypassing validations on fields that have or should have constraints
- N+1 queries: `.includes()` missing for associations used in loops/views (especially avatar, attachments)
- Bypassing model validations for direct DB writes (Rails: update_column; Django: QuerySet.update(); Prisma: raw queries)
- N+1 queries: Missing eager loading (Rails: .includes(); SQLAlchemy: joinedload(); Prisma: include) for associations used in loops/views
#### Race Conditions & Concurrency
- Read-check-write without uniqueness constraint or `rescue RecordNotUnique; retry` (e.g., `where(hash:).first` then `save!` without handling concurrent insert)
- `find_or_create_by` on columns without unique DB index — concurrent calls can create duplicates
- Read-check-write without uniqueness constraint or catch duplicate key error and retry (e.g., `where(hash:).first` then `save!` without handling concurrent insert)
- find-or-create without unique DB index — concurrent calls can create duplicates
- Status transitions that don't use atomic `WHERE old_status = ? UPDATE SET new_status` — concurrent updates can skip or double-apply transitions
- `html_safe` on user-controlled data (XSS) — check any `.html_safe`, `raw()`, or string interpolation into `html_safe` output
- Unsafe HTML rendering (Rails: .html_safe/raw(); React: dangerouslySetInnerHTML; Vue: v-html; Django: |safe/mark_safe) on user-controlled data (XSS)
#### LLM Output Trust Boundary
- LLM-generated values (emails, URLs, names) written to DB or passed to mailers without format validation. Add lightweight guards (`EMAIL_REGEXP`, `URI.parse`, `.strip`) before persisting.
@@ -159,7 +159,7 @@ the agent auto-fixes a finding or asks the user.
```
AUTO-FIX (agent fixes without asking): ASK (needs human judgment):
├─ Dead code / unused variables ├─ Security (auth, XSS, injection)
├─ N+1 queries (missing .includes()) ├─ Race conditions
├─ N+1 queries (missing eager loading) ├─ Race conditions
├─ Stale comments contradicting code ├─ Design decisions
├─ Magic numbers → named constants ├─ Large fixes (>20 lines)
├─ Missing LLM output validation ├─ Enum completeness
+1 -1
View File
@@ -9,7 +9,7 @@ This checklist applies to **source code in the diff** — not rendered output. R
**Trigger:** Only run this checklist if the diff touches frontend files. Use `gstack-diff-scope` to detect:
```bash
eval $(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)
```
If `SCOPE_FRONTEND=false`, skip the entire design review silently.