mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-14 00:12:12 +02:00
v1.33.1.0 fix(learnings): token-OR query + task-shaped retrieval in 3 long skills (#1442)
* fix(learnings): use token-OR matching in gstack-learnings-search --query
Split the query on whitespace into tokens; a learning matches if ANY
token appears as a substring in ANY of key/insight/files. Previously
the whole query was a single substring, so multi-word queries like
"debug investigation" only matched learnings whose insight contained
that exact contiguous phrase, which is usually nothing.
Whitespace-only query falls through to no-query (matches today's no-flag
behavior). Single-word queries behave exactly as before.
Adds test/gstack-learnings-search.test.ts: 3 assertions covering
multi-token, single-token, and no-query backwards compat.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(resolver): parameterized LEARNINGS_SEARCH with shell-injection guard
The {{LEARNINGS_SEARCH}} macro now accepts a query=KEYWORD argument that
gets interpolated as --query "<keyword>" into the generated bash. Empty
value falls through to no-query (principle of least surprise: a stray
{{LEARNINGS_SEARCH:query=}} placeholder gets today's behavior, not a
build failure). Pattern reuses the parameterized-macro parsing from
composition.ts. The 13 templates that don't pass a query stay
byte-identical in their generated SKILL.md output.
Shell-injection guard: the query value is whitelisted to
^[A-Za-z0-9 _-]+$ at gen-skill-docs time. Any \$(), backticks,
semicolons, or quotes throw a loud build error instead of emitting
executable bash. Static template queries are safe by inspection;
this defends against future contributors writing dangerous values.
Adds 5 assertions to test/gen-skill-docs.test.ts covering no-args,
claude+query=foo bar on both cross-project and project-scoped branches,
codex host variant, empty value semantics, and shell-injection payloads
(\$(whoami), backticks, ;, &, ", \\, \$x) throwing build errors.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(skills): task-shaped queries + mid-flow refresh in /investigate /qa /ship
The three long skills now pull learnings keyed to their theme at the
top, then re-pull at phase boundaries as work shifts to new sub-tasks.
Top-of-skill queries (5-6 token unions, token-OR matched):
- investigate: "debug investigation root cause hypothesis bug fix"
- qa: "qa testing bug regression flake fixture"
- ship: "release ship version changelog merge pr"
Mid-flow refresh blocks (concrete keyword recipe + worked examples):
- investigate: between Phase 1 (hypothesis) and Phase 2 (analysis),
keyed to the hypothesis noun. Examples: auth-cookie, session-expiry.
- qa: between Phase 7 (triage) and Phase 8 (fix loop), keyed to the
buggy component name. Examples: checkout-button, signup-form.
- ship: just before Step 12 (VERSION bump), keyed to the headline
feature. Examples: learnings-search, pacing, worktree-ship.
Keyword recipe enforces alphanumeric+hyphen only (no quotes, slashes,
dots, colons) so dynamic queries cannot inject shell metacharacters.
The other 13 short-lived skills keep the bare {{LEARNINGS_SEARCH}} form.
Backwards-compat verified via diff: their generated SKILL.md output is
byte-identical to before this change.
Golden ship fixtures regenerated to match the new ship/SKILL.md output.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore: bump version and changelog (v1.33.1.0)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test: refresh codex+factory ship golden fixtures
Follow-up to 513c9660 — the codex and factory host outputs needed
regeneration too, missed in the initial commit because gen:skill-docs
was only run for the claude host. Now matches gen:skill-docs --host all.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
+16
-2
@@ -1824,9 +1824,9 @@ Search for relevant learnings from previous sessions:
|
||||
_CROSS_PROJ=$(~/.claude/skills/gstack/bin/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
|
||||
echo "CROSS_PROJECT: $_CROSS_PROJ"
|
||||
if [ "$_CROSS_PROJ" = "true" ]; then
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" --cross-project 2>/dev/null || true
|
||||
else
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 2>/dev/null || true
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" 2>/dev/null || true
|
||||
fi
|
||||
```
|
||||
|
||||
@@ -2459,6 +2459,20 @@ already knows. A good test: would this insight save time in a future session? If
|
||||
|
||||
|
||||
|
||||
### Refresh learnings for the headline feature on this branch
|
||||
|
||||
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
|
||||
|
||||
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
|
||||
|
||||
Worked examples (ship-specific): good keywords are `learnings-search`, `pacing`, `worktree-ship`. Bad: `the branch headline`, `v1.31.1.0`, `feat: token-or search`.
|
||||
|
||||
```bash
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
|
||||
```
|
||||
|
||||
If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference — the absence is itself useful information.
|
||||
|
||||
## Step 12: Version bump (auto-decide)
|
||||
|
||||
**Idempotency check:** Before bumping, classify the state by comparing `VERSION` against the base branch AND against `package.json`'s `version` field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
|
||||
|
||||
+36
-1
@@ -310,6 +310,26 @@ Effort both-scales: when an option involves effort, label both human-team and CC
|
||||
|
||||
Net line closes the tradeoff. Per-skill instructions may add stricter rules.
|
||||
|
||||
12. **Non-ASCII characters — write directly, never \u-escape.** When any
|
||||
string field (question, option label, option description) contains
|
||||
Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
|
||||
the literal UTF-8 characters in the JSON string. **Never escape them
|
||||
as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
|
||||
and passes characters through unchanged. Manually escaping requires
|
||||
recalling each codepoint from training, which is unreliable for long
|
||||
CJK strings — the model regularly emits the wrong codepoint (e.g.
|
||||
writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
|
||||
actually , so the user sees `管理工具` rendered as `3用箱`).
|
||||
The trigger is long, multi-line questions with hundreds of CJK
|
||||
characters: that is exactly when reflexive escaping kicks in and
|
||||
exactly when miscoding is most damaging. Long ≠ escape. Keep
|
||||
characters literal.
|
||||
|
||||
Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
|
||||
Right: `"question": "請選擇管理工具"`
|
||||
|
||||
Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
|
||||
|
||||
### Self-check before emitting
|
||||
|
||||
Before calling AskUserQuestion, verify:
|
||||
@@ -322,6 +342,7 @@ Before calling AskUserQuestion, verify:
|
||||
- [ ] Dual-scale effort labels on effort-bearing options (human / CC)
|
||||
- [ ] Net line closes the decision
|
||||
- [ ] You are calling the tool, not writing prose
|
||||
- [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped
|
||||
|
||||
|
||||
## Artifacts Sync (skill start)
|
||||
@@ -1789,7 +1810,7 @@ Add a `## Verification Results` section to the PR body (Step 19):
|
||||
Search for relevant learnings from previous sessions on this project:
|
||||
|
||||
```bash
|
||||
$GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true
|
||||
$GSTACK_BIN/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" 2>/dev/null || true
|
||||
```
|
||||
|
||||
If learnings are found, incorporate them into your analysis. When a review finding
|
||||
@@ -2053,6 +2074,20 @@ already knows. A good test: would this insight save time in a future session? If
|
||||
|
||||
|
||||
|
||||
### Refresh learnings for the headline feature on this branch
|
||||
|
||||
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
|
||||
|
||||
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
|
||||
|
||||
Worked examples (ship-specific): good keywords are `learnings-search`, `pacing`, `worktree-ship`. Bad: `the branch headline`, `v1.31.1.0`, `feat: token-or search`.
|
||||
|
||||
```bash
|
||||
$GSTACK_ROOT/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
|
||||
```
|
||||
|
||||
If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference — the absence is itself useful information.
|
||||
|
||||
## Step 12: Version bump (auto-decide)
|
||||
|
||||
**Idempotency check:** Before bumping, classify the state by comparing `VERSION` against the base branch AND against `package.json`'s `version` field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
|
||||
|
||||
+37
-2
@@ -312,6 +312,26 @@ Effort both-scales: when an option involves effort, label both human-team and CC
|
||||
|
||||
Net line closes the tradeoff. Per-skill instructions may add stricter rules.
|
||||
|
||||
12. **Non-ASCII characters — write directly, never \u-escape.** When any
|
||||
string field (question, option label, option description) contains
|
||||
Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
|
||||
the literal UTF-8 characters in the JSON string. **Never escape them
|
||||
as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
|
||||
and passes characters through unchanged. Manually escaping requires
|
||||
recalling each codepoint from training, which is unreliable for long
|
||||
CJK strings — the model regularly emits the wrong codepoint (e.g.
|
||||
writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
|
||||
actually , so the user sees `管理工具` rendered as `3用箱`).
|
||||
The trigger is long, multi-line questions with hundreds of CJK
|
||||
characters: that is exactly when reflexive escaping kicks in and
|
||||
exactly when miscoding is most damaging. Long ≠ escape. Keep
|
||||
characters literal.
|
||||
|
||||
Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
|
||||
Right: `"question": "請選擇管理工具"`
|
||||
|
||||
Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
|
||||
|
||||
### Self-check before emitting
|
||||
|
||||
Before calling AskUserQuestion, verify:
|
||||
@@ -324,6 +344,7 @@ Before calling AskUserQuestion, verify:
|
||||
- [ ] Dual-scale effort labels on effort-bearing options (human / CC)
|
||||
- [ ] Net line closes the decision
|
||||
- [ ] You are calling the tool, not writing prose
|
||||
- [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped
|
||||
|
||||
|
||||
## Artifacts Sync (skill start)
|
||||
@@ -1794,9 +1815,9 @@ Search for relevant learnings from previous sessions:
|
||||
_CROSS_PROJ=$($GSTACK_BIN/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
|
||||
echo "CROSS_PROJECT: $_CROSS_PROJ"
|
||||
if [ "$_CROSS_PROJ" = "true" ]; then
|
||||
$GSTACK_BIN/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
|
||||
$GSTACK_BIN/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" --cross-project 2>/dev/null || true
|
||||
else
|
||||
$GSTACK_BIN/gstack-learnings-search --limit 10 2>/dev/null || true
|
||||
$GSTACK_BIN/gstack-learnings-search --limit 10 --query "release ship version changelog merge pr" 2>/dev/null || true
|
||||
fi
|
||||
```
|
||||
|
||||
@@ -2429,6 +2450,20 @@ already knows. A good test: would this insight save time in a future session? If
|
||||
|
||||
|
||||
|
||||
### Refresh learnings for the headline feature on this branch
|
||||
|
||||
The top-of-skill learnings pull was keyed to "release ship" broadly. Before the VERSION/CHANGELOG step, re-pull learnings keyed to THIS branch's headline feature so any prior version-bump or CHANGELOG pitfalls for similar features surface.
|
||||
|
||||
Pick ONE keyword that names the headline feature you're shipping. The keyword should be a noun: the primary skill or module name, the central feature noun, or the binary you changed. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
|
||||
|
||||
Worked examples (ship-specific): good keywords are `learnings-search`, `pacing`, `worktree-ship`. Bad: `the branch headline`, `v1.31.1.0`, `feat: token-or search`.
|
||||
|
||||
```bash
|
||||
$GSTACK_ROOT/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
|
||||
```
|
||||
|
||||
If any learnings come back, name which one applies to the version bump or CHANGELOG framing in one sentence. If none come back, continue without reference — the absence is itself useful information.
|
||||
|
||||
## Step 12: Version bump (auto-decide)
|
||||
|
||||
**Idempotency check:** Before bumping, classify the state by comparing `VERSION` against the base branch AND against `package.json`'s `version` field. Four states: FRESH (do bump), ALREADY_BUMPED (skip bump), DRIFT_STALE_PKG (sync pkg only, no re-bump), DRIFT_UNEXPECTED (stop and ask).
|
||||
|
||||
@@ -3047,3 +3047,51 @@ describe('GSTACK REVIEW REPORT delete-then-append flow', () => {
|
||||
expect(src).not.toContain('If it was found mid-file, move it');
|
||||
});
|
||||
});
|
||||
|
||||
describe('LEARNINGS_SEARCH resolver: query parameter', () => {
|
||||
// Lazy-load resolver and types after describe block to keep test file self-contained.
|
||||
const { generateLearningsSearch } = require('../scripts/resolvers/learnings');
|
||||
const { HOST_PATHS } = require('../scripts/resolvers/types');
|
||||
|
||||
const claudeCtx = {
|
||||
skillName: 'test',
|
||||
tmplPath: 'test/SKILL.md.tmpl',
|
||||
host: 'claude',
|
||||
paths: HOST_PATHS.claude,
|
||||
};
|
||||
const codexCtx = { ...claudeCtx, host: 'codex', paths: HOST_PATHS.codex };
|
||||
|
||||
test('no args → bash does not contain --query (backwards-compat)', () => {
|
||||
const out = generateLearningsSearch(claudeCtx);
|
||||
expect(out).not.toContain('--query');
|
||||
});
|
||||
|
||||
test('claude host + query=foo bar → both cross-project and project-scoped branches contain --query', () => {
|
||||
const out = generateLearningsSearch(claudeCtx, ['query=foo bar']);
|
||||
// Both branches of the if/else must carry the flag.
|
||||
const lines = out.split('\n').filter(l => l.includes('gstack-learnings-search'));
|
||||
expect(lines.length).toBeGreaterThanOrEqual(2);
|
||||
for (const line of lines) {
|
||||
expect(line).toContain('--query "foo bar"');
|
||||
}
|
||||
});
|
||||
|
||||
test('codex host + query=foo bar → codex bash variant contains --query', () => {
|
||||
const out = generateLearningsSearch(codexCtx, ['query=foo bar']);
|
||||
expect(out).toContain('--query "foo bar"');
|
||||
expect(out).toContain('$GSTACK_BIN/gstack-learnings-search');
|
||||
});
|
||||
|
||||
test('empty value query= → bash does not contain --query (locked semantics: falls through)', () => {
|
||||
const claudeOut = generateLearningsSearch(claudeCtx, ['query=']);
|
||||
expect(claudeOut).not.toContain('--query');
|
||||
const codexOut = generateLearningsSearch(codexCtx, ['query=']);
|
||||
expect(codexOut).not.toContain('--query');
|
||||
});
|
||||
|
||||
test('shell-injection chars in query= → throws at gen-time (defense in depth)', () => {
|
||||
for (const bad of ['$(whoami)', '`cmd`', 'a;b', 'a&b', 'a"b', 'a\\b', 'foo$x']) {
|
||||
expect(() => generateLearningsSearch(claudeCtx, [`query=${bad}`])).toThrow(/alphanumeric/);
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
@@ -0,0 +1,60 @@
|
||||
import { describe, test, expect, beforeAll, afterAll } from 'bun:test';
|
||||
import * as fs from 'fs';
|
||||
import * as path from 'path';
|
||||
import * as os from 'os';
|
||||
import { execFileSync } from 'child_process';
|
||||
|
||||
const ROOT = path.resolve(import.meta.dir, '..');
|
||||
const BIN = path.join(ROOT, 'bin', 'gstack-learnings-search');
|
||||
|
||||
const tmpHome = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-search-test-'));
|
||||
const tmpCwd = fs.mkdtempSync(path.join(os.tmpdir(), 'gstack-search-cwd-'));
|
||||
// gstack-slug derives slug from git remote (none here) → falls back to basename of cwd.
|
||||
const slug = path.basename(tmpCwd).replace(/[^a-zA-Z0-9._-]/g, '');
|
||||
const projDir = path.join(tmpHome, 'projects', slug);
|
||||
|
||||
function run(args: string[]): string {
|
||||
return execFileSync(BIN, args, {
|
||||
env: { ...process.env, GSTACK_HOME: tmpHome },
|
||||
cwd: tmpCwd,
|
||||
encoding: 'utf-8',
|
||||
});
|
||||
}
|
||||
|
||||
beforeAll(() => {
|
||||
fs.mkdirSync(projDir, { recursive: true });
|
||||
const entries = [
|
||||
{ ts: '2026-05-01T00:00:00Z', skill: 'test', type: 'pattern', key: 'foo-pattern', insight: 'A foo-related insight', confidence: 8, source: 'observed', files: [] },
|
||||
{ ts: '2026-05-02T00:00:00Z', skill: 'test', type: 'pitfall', key: 'bar-pitfall', insight: 'A bar-related insight', confidence: 8, source: 'observed', files: [] },
|
||||
{ ts: '2026-05-03T00:00:00Z', skill: 'test', type: 'pattern', key: 'baz-pattern', insight: 'A baz-related insight', confidence: 8, source: 'observed', files: [] },
|
||||
];
|
||||
fs.writeFileSync(path.join(projDir, 'learnings.jsonl'), entries.map(e => JSON.stringify(e)).join('\n') + '\n');
|
||||
});
|
||||
|
||||
afterAll(() => {
|
||||
fs.rmSync(tmpHome, { recursive: true, force: true });
|
||||
fs.rmSync(tmpCwd, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
describe('gstack-learnings-search token-OR query semantics', () => {
|
||||
test('multi-token query returns entries matching ANY token', () => {
|
||||
const out = run(['--query', 'foo bar']);
|
||||
expect(out).toContain('foo-pattern');
|
||||
expect(out).toContain('bar-pitfall');
|
||||
expect(out).not.toContain('baz-pattern');
|
||||
});
|
||||
|
||||
test('single-token query returns only entries matching that token', () => {
|
||||
const out = run(['--query', 'foo']);
|
||||
expect(out).toContain('foo-pattern');
|
||||
expect(out).not.toContain('bar-pitfall');
|
||||
expect(out).not.toContain('baz-pattern');
|
||||
});
|
||||
|
||||
test('no --query flag returns all entries (backwards-compat)', () => {
|
||||
const out = run(['--limit', '10']);
|
||||
expect(out).toContain('foo-pattern');
|
||||
expect(out).toContain('bar-pitfall');
|
||||
expect(out).toContain('baz-pattern');
|
||||
});
|
||||
});
|
||||
Reference in New Issue
Block a user