diff --git a/CHANGELOG.md b/CHANGELOG.md index 39cf4d550..64c1ee663 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,35 @@ # Changelog +## [1.58.0.0] - 2026-06-01 + +## **Every skill that asks you questions got a little lighter, all at once — the AskUserQuestion preamble stopped carrying its rare-case manuals inline.** + +The AskUserQuestion format block is inlined into every interactive skill (~33 of them). It carried the full multi-paragraph CJK / non-ASCII escaping manual inline, even though that rule only matters when a question contains Chinese, Japanese, or Korean text. The operative rule ("write non-ASCII characters literally, never `\u`-escape") already lives in the always-loaded self-check, so the long justification moved to `docs/askuserquestion-cjk.md`, read on demand. One change, every skill benefits. This is the preamble half of the token-reduction program: per-skill carves shrink one skill at a time, this shrinks the shared surface that rides on all of them. + +### The numbers that matter + +Measured across the Claude-host corpus (`cat SKILL.md */SKILL.md | wc -c`), regenerated for all hosts: + +| Metric | Before (v1.57) | After (v1.58) | Δ | +|--------|----------------|---------------|---| +| Claude-host skill corpus | 3,087,499 B | 3,057,975 B | -29,524 B | +| per interactive skill | full CJK manual inline | rule + 1 doc pointer | ~900 B each × ~33 | +| AUQ core format (Layer 0) | always-loaded | always-loaded (unchanged) | guaranteed | + +The core decision-brief format (ELI10, recommendation, pros/cons, stakes, self-check) is untouched and still always-loaded — Layer 0 enforces it. Only the rarely-needed CJK rationale moved on-demand. + +### What this means for you + +Nothing changes in how questions look or behave. For the rare CJK question, the agent reads one small doc for the full rationale; the operative rule was never removed. Every interactive skill is ~900 bytes lighter at the always-loaded layer. 
+ +### Itemized changes + +#### Added +- `docs/askuserquestion-cjk.md` — full non-ASCII / CJK escaping rationale + worked example, read on demand. + +#### Changed +- The AskUserQuestion preamble block trims the inline CJK manual to the operative rule + a doc pointer; the self-check reminder stays always-loaded. + ## [1.57.0.0] - 2026-06-01 ## **/office-hours got 25% lighter, and there is now a test that proves slimming a skill never degrades the questions it asks you.** diff --git a/VERSION b/VERSION index a17d4bbc0..3a62339b5 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.57.0.0 +1.58.0.0 diff --git a/autoplan/SKILL.md b/autoplan/SKILL.md index f8c20cd59..e7bf2364f 100644 --- a/autoplan/SKILL.md +++ b/autoplan/SKILL.md @@ -371,25 +371,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/canary/SKILL.md b/canary/SKILL.md index e7a1715f8..93c442da6 100644 --- a/canary/SKILL.md +++ b/canary/SKILL.md @@ -363,25 +363,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/codex/SKILL.md b/codex/SKILL.md index af351d7f1..29b0f4615 100644 --- a/codex/SKILL.md +++ b/codex/SKILL.md @@ -366,25 +366,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/context-restore/SKILL.md b/context-restore/SKILL.md index 7a272722e..9d71e10b7 100644 --- a/context-restore/SKILL.md +++ b/context-restore/SKILL.md @@ -367,25 +367,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/context-save/SKILL.md b/context-save/SKILL.md index 014407fbe..860780a77 100644 --- a/context-save/SKILL.md +++ b/context-save/SKILL.md @@ -366,25 +366,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/cso/SKILL.md b/cso/SKILL.md index ebacf1ac0..452512e7b 100644 --- a/cso/SKILL.md +++ b/cso/SKILL.md @@ -369,25 +369,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/design-consultation/SKILL.md b/design-consultation/SKILL.md index 9bab21e2d..a5629d204 100644 --- a/design-consultation/SKILL.md +++ b/design-consultation/SKILL.md @@ -389,25 +389,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/design-html/SKILL.md b/design-html/SKILL.md index f6e9e17f8..f69825849 100644 --- a/design-html/SKILL.md +++ b/design-html/SKILL.md @@ -370,25 +370,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/design-review/SKILL.md b/design-review/SKILL.md index e874a94aa..04077d821 100644 --- a/design-review/SKILL.md +++ b/design-review/SKILL.md @@ -367,25 +367,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/design-shotgun/SKILL.md b/design-shotgun/SKILL.md index 9fd662ce6..38a9e3471 100644 --- a/design-shotgun/SKILL.md +++ b/design-shotgun/SKILL.md @@ -384,25 +384,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/devex-review/SKILL.md b/devex-review/SKILL.md index 14ed560d2..5e817b370 100644 --- a/devex-review/SKILL.md +++ b/devex-review/SKILL.md @@ -369,25 +369,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. 
- - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/docs/askuserquestion-cjk.md b/docs/askuserquestion-cjk.md new file mode 100644 index 000000000..54f4ac345 --- /dev/null +++ b/docs/askuserquestion-cjk.md @@ -0,0 +1,29 @@ +# AskUserQuestion — non-ASCII / CJK characters + +Read this on demand when an AskUserQuestion contains Chinese (繁體/簡體), +Japanese, Korean, or other non-ASCII text. The operative rule is in the +always-loaded AskUserQuestion self-check ("Non-ASCII characters written directly, +NOT \u-escaped"); this doc is the full justification. + +## The rule + +When any string field (question, option label, option description) contains +non-ASCII text, emit the literal UTF-8 characters in the JSON string. **Never +escape them as `\uXXXX`.** + +Claude Code's tool parameter pipe is UTF-8 native and passes characters through +unchanged. Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. + +## Why escaping fails + +Manually escaping requires recalling each codepoint from training, which is +unreliable for long CJK strings — the model regularly emits the wrong codepoint. +Example: writing `\u3103` thinking it is 管 (U+7BA1), but `\u3103` is actually ㄃, +so the user sees `管理工具` rendered as `㄃3用箱`. 
+ +The trigger is long, multi-line questions with hundreds of CJK characters: that +is exactly when reflexive escaping kicks in and exactly when miscoding is most +damaging. Long ≠ escape. Keep characters literal. + +- Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` +- Right: `"question": "請選擇管理工具"` diff --git a/document-generate/SKILL.md b/document-generate/SKILL.md index 2c7e6f072..e875878bb 100644 --- a/document-generate/SKILL.md +++ b/document-generate/SKILL.md @@ -369,25 +369,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/document-release/SKILL.md b/document-release/SKILL.md index 43ba9adb1..4be1d9a6c 100644 --- a/document-release/SKILL.md +++ b/document-release/SKILL.md @@ -367,25 +367,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/health/SKILL.md b/health/SKILL.md index 921a7b5b4..08881e6fc 100644 --- a/health/SKILL.md +++ b/health/SKILL.md @@ -365,25 +365,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/investigate/SKILL.md b/investigate/SKILL.md index daf6be6d8..8ff40fe32 100644 --- a/investigate/SKILL.md +++ b/investigate/SKILL.md @@ -404,25 +404,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/ios-clean/SKILL.md b/ios-clean/SKILL.md index c9073b6d5..1496ce328 100644 --- a/ios-clean/SKILL.md +++ b/ios-clean/SKILL.md @@ -367,25 +367,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/ios-design-review/SKILL.md b/ios-design-review/SKILL.md index 7bfbdd851..eaef04448 100644 --- a/ios-design-review/SKILL.md +++ b/ios-design-review/SKILL.md @@ -369,25 +369,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/ios-fix/SKILL.md b/ios-fix/SKILL.md index 2d1c3d4b1..68f120baa 100644 --- a/ios-fix/SKILL.md +++ b/ios-fix/SKILL.md @@ -370,25 +370,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/ios-qa/SKILL.md b/ios-qa/SKILL.md index 0d40c16e5..b9b2ec824 100644 --- a/ios-qa/SKILL.md +++ b/ios-qa/SKILL.md @@ -373,25 +373,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/ios-sync/SKILL.md b/ios-sync/SKILL.md index e7a803924..4aaa9da4a 100644 --- a/ios-sync/SKILL.md +++ b/ios-sync/SKILL.md @@ -367,25 +367,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/land-and-deploy/SKILL.md b/land-and-deploy/SKILL.md index 2eb9faa6c..124c78700 100644 --- a/land-and-deploy/SKILL.md +++ b/land-and-deploy/SKILL.md @@ -362,25 +362,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/landing-report/SKILL.md b/landing-report/SKILL.md index aec9978ba..c64bd4932 100644 --- a/landing-report/SKILL.md +++ b/landing-report/SKILL.md @@ -363,25 +363,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/learn/SKILL.md b/learn/SKILL.md index 08a78b23c..d8d720366 100644 --- a/learn/SKILL.md +++ b/learn/SKILL.md @@ -365,25 +365,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/office-hours/SKILL.md b/office-hours/SKILL.md index 761eb6a74..6303e12c1 100644 --- a/office-hours/SKILL.md +++ b/office-hours/SKILL.md @@ -400,25 +400,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/open-gstack-browser/SKILL.md b/open-gstack-browser/SKILL.md index 64a93770e..f2a240bee 100644 --- a/open-gstack-browser/SKILL.md +++ b/open-gstack-browser/SKILL.md @@ -362,25 +362,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. 
+**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/package.json b/package.json index a9440ba15..068a9272c 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gstack", - "version": "1.57.0.0", + "version": "1.58.0.0", "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module", diff --git a/pair-agent/SKILL.md b/pair-agent/SKILL.md index 533a29dc7..0fb09f241 100644 --- a/pair-agent/SKILL.md +++ b/pair-agent/SKILL.md @@ -364,25 +364,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index ffba939b1..d0e3a1f07 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -394,25 +394,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/plan-design-review/SKILL.md b/plan-design-review/SKILL.md index b1b110ae1..6a349d586 100644 --- a/plan-design-review/SKILL.md +++ b/plan-design-review/SKILL.md @@ -366,25 +366,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/plan-devex-review/SKILL.md b/plan-devex-review/SKILL.md index 7336b70a5..ba7c33899 100644 --- a/plan-devex-review/SKILL.md +++ b/plan-devex-review/SKILL.md @@ -372,25 +372,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md index c4ec10bb6..51017a081 100644 --- a/plan-eng-review/SKILL.md +++ b/plan-eng-review/SKILL.md @@ -370,25 +370,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/plan-tune/SKILL.md b/plan-tune/SKILL.md index 41c45342e..fa6d88551 100644 --- a/plan-tune/SKILL.md +++ b/plan-tune/SKILL.md @@ -375,25 +375,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/qa-only/SKILL.md b/qa-only/SKILL.md index db1c3dd08..91cbac6fa 100644 --- a/qa-only/SKILL.md +++ b/qa-only/SKILL.md @@ -365,25 +365,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/qa/SKILL.md b/qa/SKILL.md index c5fdf9b56..d22bb1bfd 100644 --- a/qa/SKILL.md +++ b/qa/SKILL.md @@ -371,25 +371,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/retro/SKILL.md b/retro/SKILL.md index 287f24e35..9bee0b745 100644 --- a/retro/SKILL.md +++ b/retro/SKILL.md @@ -382,25 +382,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/review/SKILL.md b/review/SKILL.md index 4d8049d54..0b83a3bd6 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -367,25 +367,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/scrape/SKILL.md b/scrape/SKILL.md index 0af5db506..72c2962d0 100644 --- a/scrape/SKILL.md +++ b/scrape/SKILL.md @@ -363,25 +363,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/scripts/resolvers/preamble/generate-ask-user-format.ts b/scripts/resolvers/preamble/generate-ask-user-format.ts index e71b39e41..ab24eb507 100644 --- a/scripts/resolvers/preamble/generate-ask-user-format.ts +++ b/scripts/resolvers/preamble/generate-ask-user-format.ts @@ -75,25 +75,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see \`docs/askuserquestion-split.md\` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \\u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as \`\\uXXXX\`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. 
- writes \`\\u3103\` thinking it is 管 U+7BA1, but \`\\u3103\` is - actually ㄃, so the user sees \`管理工具\` rendered as \`㄃3用箱\`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: \`"question": "請選擇\\uXXXX\\uXXXX\\uXXXX\\uXXXX"\` - Right: \`"question": "請選擇管理工具"\` - - Only JSON-mandatory escapes remain allowed: \`\\n\`, \`\\t\`, \`\\"\`, \`\\\\\`. +**Non-ASCII characters — write directly, never \\u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as \`\\uXXXX\` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only \`\\n\`, +\`\\t\`, \`\\"\`, \`\\\\\` remain allowed. Full rationale + worked example: see +\`docs/askuserquestion-cjk.md\`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/setup-deploy/SKILL.md b/setup-deploy/SKILL.md index a35ab9764..fe32c61e4 100644 --- a/setup-deploy/SKILL.md +++ b/setup-deploy/SKILL.md @@ -366,25 +366,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. 
- writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/setup-gbrain/SKILL.md b/setup-gbrain/SKILL.md index cad27fcec..5fb3c4050 100644 --- a/setup-gbrain/SKILL.md +++ b/setup-gbrain/SKILL.md @@ -365,25 +365,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. 
- writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/ship/SKILL.md b/ship/SKILL.md index ecf203787..3586f13ec 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -367,25 +367,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. 
- writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/skillify/SKILL.md b/skillify/SKILL.md index e7911473e..f558dd594 100644 --- a/skillify/SKILL.md +++ b/skillify/SKILL.md @@ -363,25 +363,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. 
- writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/spec/SKILL.md b/spec/SKILL.md index 7279b9c37..4289931f7 100644 --- a/spec/SKILL.md +++ b/spec/SKILL.md @@ -364,25 +364,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. 
- writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). - The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting @@ -1375,25 +1362,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting diff --git a/sync-gbrain/SKILL.md b/sync-gbrain/SKILL.md index 4a3a5bc1d..17be5c34c 100644 --- a/sync-gbrain/SKILL.md +++ b/sync-gbrain/SKILL.md @@ -365,25 +365,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr **Full rule + worked examples + Hold/dependency semantics:** see `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4. -**Non-ASCII characters — write directly, never \u-escape.** When any - string field (question, option label, option description) contains - Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit - the literal UTF-8 characters in the JSON string. **Never escape them - as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native - and passes characters through unchanged. Manually escaping requires - recalling each codepoint from training, which is unreliable for long - CJK strings — the model regularly emits the wrong codepoint (e.g. - writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is - actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`). 
- The trigger is long, multi-line questions with hundreds of CJK - characters: that is exactly when reflexive escaping kicks in and - exactly when miscoding is most damaging. Long ≠ escape. Keep - characters literal. - - Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"` - Right: `"question": "請選擇管理工具"` - - Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`. +**Non-ASCII characters — write directly, never \u-escape.** When any string +field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, +emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is +UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`, +`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see +`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK. ### Self-check before emitting