fix: improve contributor mode + qa-quick E2E reliability

Contributor mode:
- Add "do not truncate" directive to template — agent was stopping
  after "My rating" without completing Steps/Raw output/What would
  make this a 10 sections
- Restore assertions for Steps to reproduce and Date footer
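
A minimal sketch of the kind of completeness check the restored assertions imply. The marker strings come from the template and test in this commit; the `missingSections` helper and the sample report are hypothetical and are not part of the change itself.

```typescript
// Hypothetical helper: lists which template markers are absent from a report.
// Marker strings are taken from the contributor template and the E2E test.
const REQUIRED_MARKERS: string[] = [
  'What I was trying to do',
  'What happened instead',
  '## Steps to reproduce',
  '## Raw output',
  '## What would make this a 10',
  '**Date:',
];

function missingSections(
  report: string,
  markers: string[] = REQUIRED_MARKERS,
): string[] {
  // Keep only the markers that do not appear anywhere in the report text.
  return markers.filter((marker) => !report.includes(marker));
}

// Made-up report that stops after the rating, mimicking the truncation
// described above.
const truncatedReport: string = [
  '# Example title',
  'What I was trying to do: run a skill',
  'What happened instead: it failed',
  'My rating: 6',
].join('\n');

console.log(missingSections(truncatedReport));
```

For the truncated sample, the helper reports the four sections from Steps to reproduce through the Date footer as missing.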

QA quick:
- Make test server URL prominent: top of prompt, explicit "already
  running" and "do NOT discover ports" instructions
- Bump session timeout 180s→240s and test timeout 240s→300s
- Set B= at top of prompt (was buried in prose)
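
The two timeout values keep the same 60s gap before and after the bump. A small sketch of that relationship follows, assuming the outer test timeout is meant to outlast the inner session timeout; `timeoutMarginMs` is a hypothetical helper, not code from this commit.

```typescript
// Values from this commit (milliseconds).
const SESSION_TIMEOUT_MS: number = 240_000; // passed to runSkillTest
const TEST_TIMEOUT_MS: number = 300_000; // passed to the test() call

// Hypothetical check: the test-level timeout should exceed the session
// timeout so the session can finish (or time out) before the test does.
function timeoutMarginMs(sessionMs: number, testMs: number): number {
  if (testMs <= sessionMs) {
    throw new Error(
      `test timeout (${testMs}ms) must exceed session timeout (${sessionMs}ms)`,
    );
  }
  return testMs - sessionMs;
}

console.log(timeoutMarginMs(SESSION_TIMEOUT_MS, TEST_TIMEOUT_MS)); // 60000
console.log(timeoutMarginMs(180_000, 240_000)); // 60000 (previous values)
```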

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Garry Tan
2026-03-16 10:12:56 -05:00
parent d4b1fd0ddc
commit c90feac2ca
12 changed files with 53 additions and 27 deletions
+4 -2
@@ -52,7 +52,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -67,7 +67,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -52,7 +52,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -67,7 +67,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -52,7 +52,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -67,7 +67,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -52,7 +52,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -67,7 +67,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -51,7 +51,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -66,7 +66,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -56,7 +56,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -71,7 +71,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -51,7 +51,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -66,7 +66,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -52,7 +52,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -67,7 +67,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -131,7 +131,7 @@ If \`_CONTRIB\` is \`true\`: you are in **contributor mode**. You're a gstack us
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write \`~/.gstack/contributor-logs/{slug}.md\`:
+**To file:** write \`~/.gstack/contributor-logs/{slug}.md\` with **all sections below** (do not truncate — include every section through the Date/Version footer):
\`\`\`
# {Title}
@@ -146,7 +146,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+\`\`\`
+{paste the actual error or unexpected output here}
+\`\`\`
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -49,7 +49,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -64,7 +64,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+4 -2
@@ -51,7 +51,7 @@ If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user w
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
-**To file:** write `~/.gstack/contributor-logs/{slug}.md`:
+**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
```
# {Title}
@@ -66,7 +66,9 @@ Hey gstack team — ran into this while using /{skill-name}:
1. {step}
## Raw output
-(error messages or unexpected output in a code block)
+```
+{paste the actual error or unexpected output here}
+```
## What would make this a 10
{one sentence: what gstack should have done differently}
+9 -5
@@ -328,8 +328,8 @@ File a contributor report about this issue. Then tell me what you filed.`,
expect(logContent).toContain('What I was trying to do');
expect(logContent).toContain('What happened instead');
expect(logContent).toMatch(/rating/i);
-// "What would make this a 10" is nice-to-have — agent may truncate the report
-// The key signal is using "My rating:" (new format) vs "How annoying" (old format)
+expect(logContent).toContain('## Steps to reproduce');
+expect(logContent).toContain('**Date:');
// Clean up
try { fs.rmSync(contribDir, { recursive: true, force: true }); } catch {}
@@ -428,16 +428,20 @@ describeE2E('QA skill E2E', () => {
test('/qa quick completes without browse errors', async () => {
const result = await runSkillTest({
-prompt: `You have a browse binary at ${browseBin}. Assign it to B variable like: B="${browseBin}"
+prompt: `B="${browseBin}"
+The test server is already running at: ${testServer.url}
+Target page: ${testServer.url}/basic.html
Read the file qa/SKILL.md for the QA workflow instructions.
Run a Quick-depth QA test on ${testServer.url}/basic.html
Do NOT use AskUserQuestion run Quick tier directly.
+Do NOT try to start a server or discover ports the URL above is ready.
Write your report to ${qaDir}/qa-reports/qa-report.md`,
workingDirectory: qaDir,
maxTurns: 35,
-timeout: 180_000,
+timeout: 240_000,
testName: 'qa-quick',
runId,
});
@@ -452,7 +456,7 @@ Write your report to ${qaDir}/qa-reports/qa-report.md`,
}
// Accept error_max_turns — the agent doing thorough QA work is not a failure
expect(['success', 'error_max_turns']).toContain(result.exitReason);
-}, 240_000);
+}, 300_000);
});
// --- B5: Review skill E2E ---