exclude: tax-app from throughput analysis (import-dominated history)

tax-app's history is a single commit of 104K logical lines: an initial
import of a codebase, not authored work. Exclude it so the comparison
stays honest.

Changes:
- scripts/garry-output-comparison.ts: added EXCLUDED_REPOS constant
  with tax-app + a one-line rationale. The script now skips excluded
  repos with a stderr note and deletes any stale output JSON so
  aggregation loops don't pick up pre-exclusion numbers.

- README hero: updated to 810× run rate + 240× YTD (were 880×/260×).
  Wording updated to "40 public + private repos ... after excluding
  repos dominated by imported code."

- docs/ON_THE_LOC_CONTROVERSY.md: updated all numbers, added an
  "Exclusions" paragraph explaining tax-app, removed tax-app from
  the "shipped not WIP" example list.

New numbers (2026 through day 108, without tax-app):
  - To-date:  240× logical SLOC (1,233,062 vs 5,143)
  - Run rate: 810× per-day pace (11,417 vs 14 logical/day)
  - Annualized: ~4.2M logical lines projected
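
For anyone re-deriving the figures above, here is a minimal sketch of the
arithmetic. It assumes the 2013 baseline is spread over a full 365-day year
and 2026 over the 108 elapsed days, which reproduces the stated 14 and 11,417
logical lines/day. The function and constant names are illustrative and are
not taken from the script.

```typescript
// Sketch of the multiples quoted in the commit message.
// Assumption: 2013 is averaged over 365 days, 2026 over 108 elapsed days.
const DAYS_ELAPSED_2026 = 108;
const DAYS_IN_2013 = 365;

// Hypothetical helper, not part of garry-output-comparison.ts.
function throughputMultiples(lines2026: number, lines2013: number) {
  const perDay2026 = lines2026 / DAYS_ELAPSED_2026;
  const perDay2013 = lines2013 / DAYS_IN_2013;
  return {
    toDate: lines2026 / lines2013,   // raw to-date ratio
    runRate: perDay2026 / perDay2013, // per-day pace ratio
    perDay2026,
    perDay2013,
    annualized: perDay2026 * 365,    // projected full-year 2026 total
  };
}

const m = throughputMultiples(1_233_062, 5_143);
console.log(m);
```

Under these assumptions the raw ratios come out slightly below 240 and
slightly above 810, and the projection is roughly 4.17M lines, consistent
with the rounded numbers in the list.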

Future re-runs automatically skip tax-app. Add more exclusions to
EXCLUDED_REPOS at the top of the script with a one-line rationale.
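
A minimal sketch of the lookup, for anyone extending the list. The helper
name `exclusionReason` is hypothetical, and the own-property check is a
variation, not what the script does: the script tests
`EXCLUDED_REPOS[repoBasename]` for truthiness, which would also match
inherited keys such as `constructor` if a repo ever had that name.

```typescript
// Same shape as the constant added in this commit.
const EXCLUDED_REPOS: Record<string, string> = {
  'tax-app': 'single 104K-line initial import, not authored code',
};

// Hypothetical helper: returns the rationale for an excluded repo basename,
// or null when the repo should be analyzed. Uses an own-property check so
// names inherited from Object.prototype are never treated as exclusions.
function exclusionReason(repoBasename: string): string | null {
  if (!Object.prototype.hasOwnProperty.call(EXCLUDED_REPOS, repoBasename)) {
    return null;
  }
  return EXCLUDED_REPOS[repoBasename] ?? null;
}

console.log(exclusionReason('tax-app'));
console.log(exclusionReason('some-other-repo'));
```

In the script the basename comes from
`path.basename(path.resolve(repoRoot))`, so the key in EXCLUDED_REPOS must
match the repo directory name exactly.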

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Garry Tan
2026-04-18 12:23:20 +08:00
parent c5e2e1bf66
commit 28f7876ea5
3 changed files with 39 additions and 15 deletions
@@ -38,6 +38,14 @@ const GARRY_EMAILS = [
const TARGET_YEARS = [2013, 2026];
// Repos to skip entirely because their activity is dominated by imported code
// (initial commit that vendors an upstream codebase) rather than authored work.
// When the script is pointed at one of these, it emits a stderr note and exits
// without writing a per-repo JSON. Add more via PR with a one-line rationale.
const EXCLUDED_REPOS: Record<string, string> = {
  'tax-app': 'single 104K-line initial import, not authored code',
};

type PerYearResult = {
  year: number;
  active: boolean;
@@ -284,6 +292,20 @@ function main() {
    ? args[repoRootIdx + 1]
    : process.cwd();

  // Check exclusion list — skip with stderr note if repo basename matches.
  // Also delete any stale output JSON so aggregation loops don't pick up
  // numbers from a pre-exclusion run.
  const repoBasename = path.basename(path.resolve(repoRoot));
  if (EXCLUDED_REPOS[repoBasename]) {
    const staleOutput = path.join(repoRoot, 'docs', 'throughput-2013-vs-2026.json');
    if (fs.existsSync(staleOutput)) fs.unlinkSync(staleOutput);
    process.stderr.write(
      `Skipping ${repoBasename}: ${EXCLUDED_REPOS[repoBasename]}\n` +
      `(add/remove in EXCLUDED_REPOS at the top of this script)\n`
    );
    process.exit(0);
  }

  const sccAvailable = hasScc();
  if (!sccAvailable) {
    printSccHint();