mirror of https://github.com/garrytan/gstack.git synced 2026-06-17 15:20:11 +02:00

Files

T

Garry Tan c52d3ab303 refactor(preamble): carve CJK-escaping manual to on-demand doc

The AskUserQuestion format block is inlined into every interactive skill (~33).
It carried the full multi-paragraph non-ASCII/CJK escaping manual inline, but
that rationale only matters when a question contains CJK text and the operative
rule already lives in the always-loaded self-check. Moved the justification to
docs/askuserquestion-cjk.md (read on demand); kept the rule + a pointer.

Corpus: Claude-host SKILL.md total 3,087,499 -> 3,057,975 B (-29,524 B, ~900 B
x ~33 skills). Layer 0 still passes — the core decision-brief format stays
always-loaded; only the rare CJK rationale moved. Atomic with the all-host
regen (skill-docs.yml freshness gate). VERSION + package.json -> 1.58.0.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-01 23:04:41 -07:00

1.3 KiB

Raw Blame History

AskUserQuestion — non-ASCII / CJK characters

Read this on demand when an AskUserQuestion contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text. The operative rule is in the always-loaded AskUserQuestion self-check ("Non-ASCII characters written directly, NOT \u-escaped"); this doc is the full justification.

The rule

When any string field (question, option label, option description) contains non-ASCII text, emit the literal UTF-8 characters in the JSON string. Never escape them as \uXXXX.

Claude Code's tool parameter pipe is UTF-8 native and passes characters through unchanged. Only JSON-mandatory escapes remain allowed: \n, \t, \", \\.

Why escaping fails

Manually escaping requires recalling each codepoint from training, which is unreliable for long CJK strings — the model regularly emits the wrong codepoint. Example: writing ㄃ thinking it is 管 (U+7BA1), but ㄃ is actually ㄃, so the user sees 管理工具 rendered as ㄃3用箱.

The trigger is long, multi-line questions with hundreds of CJK characters: that is exactly when reflexive escaping kicks in and exactly when miscoding is most damaging. Long ≠ escape. Keep characters literal.

Wrong: "question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"
Right: "question": "請選擇管理工具"

1.3 KiB Raw Blame History

AskUserQuestion — non-ASCII / CJK characters

The rule

Why escaping fails

1.3 KiB

Raw Blame History