Repo code-size ceiling — WARN at 100K tokens for own repos, advisory only, forks exempt
Repo code-size ceiling — advisory WARN at 100K
Rule
Own repos (repos/own/** and chirag127/* on GitHub):
- WARN at 100,000 tokens of executable code
- Advisory only — the audit prints a warning and continues. Never fails CI.
- No hard failure threshold. Splitting is a judgement call, not a gate.
Forks (repos/frk/**): exempt. Byte-identical to upstream per no-fork-divergence.
Applies to code only: .rs, .ts, .tsx, .js, .jsx, .mjs, .py, .go, .java, .swift, .cpp, .c, .rb, .php, .cs, .lua, .sh, .ps1, .vue, .svelte, .astro, .sql, .graphql, .proto, .css, .scss, .html. Excludes: .md, .mdx, .json, .jsonc, .yml, .yaml, .toml, lockfiles, .env*, images, binaries.
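The extension list above can be read as a simple allow-list check. The following is a minimal sketch, not the audit's actual implementation: the helper name `isCountedCodeFile` and the constant `CODE_EXTENSIONS` are hypothetical, and the extensions are copied from the rule text.

```typescript
// Extensions counted as code, copied from the rule text above.
const CODE_EXTENSIONS: ReadonlySet<string> = new Set([
  ".rs", ".ts", ".tsx", ".js", ".jsx", ".mjs", ".py", ".go", ".java",
  ".swift", ".cpp", ".c", ".rb", ".php", ".cs", ".lua", ".sh", ".ps1",
  ".vue", ".svelte", ".astro", ".sql", ".graphql", ".proto", ".css",
  ".scss", ".html",
]);

// Hypothetical helper: true when a file path counts toward the ceiling.
function isCountedCodeFile(filePath: string): boolean {
  // Take the last path segment, accepting both "/" and "\" separators.
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  // No extension, or a pure dotfile such as ".env": never counted.
  if (dot <= 0) return false;
  return CODE_EXTENSIONS.has(fileName.slice(dot).toLowerCase());
}
```

Because this is an allow-list, everything in the Excludes list (markdown, JSON, YAML, lockfiles, images, binaries) drops out without needing its own deny-list.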
Rationale
- 100K tokens ≈ 7.5K LOC of source. Fits comfortably in one AI-agent context turn with plenty of margin for retrieval + system prompt.
- Aggressive but not blocking. A repo drifting to 150K is a nudge, not a broken build.
- Merging repos to hit 100K is worse than a slightly-over-threshold repo — atomicity of purpose > raw size.
Current state (2026-07-03 audit)
18 own repos audited:
| Repo | Tokens | Status |
|---|---|---|
| bookmark-mind-bs-ext | 363K | WARN (3.6× threshold) |
| hermes-config | 137K | WARN |
| agent-skills | 102K | WARN |
| 15 others | <50K each | pass |
3 own repos above 100K. All warned, none blocked, none forced to split.
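The statuses in the table follow directly from the rule. Here is a hedged sketch of that classification, assuming the `repos/own/` and `repos/frk/` path prefixes from the Rule section and assuming the comparison is inclusive at exactly 100,000 tokens. The names `classifyRepo` and `AuditResult` are hypothetical.

```typescript
const WARN_THRESHOLD_TOKENS = 100_000;

type AuditStatus = "pass" | "WARN" | "exempt";

interface AuditResult {
  repo: string;
  status: AuditStatus;
  ratio: number; // tokens divided by the WARN threshold
}

// Hypothetical helper: maps a repo path and token count to an advisory status.
function classifyRepo(repoPath: string, tokens: number): AuditResult {
  const trimmed = repoPath.replace(/\/+$/, "");
  const repo = trimmed.split("/").pop() ?? trimmed;
  const ratio = tokens / WARN_THRESHOLD_TOKENS;
  // Forks are exempt regardless of size.
  if (trimmed.startsWith("repos/frk/")) {
    return { repo, status: "exempt", ratio };
  }
  // Assumption: "WARN at 100,000" is inclusive.
  const status: AuditStatus = tokens >= WARN_THRESHOLD_TOKENS ? "WARN" : "pass";
  return { repo, status, ratio };
}
```

For the largest repo in the table, `classifyRepo("repos/own/bookmark-mind-bs-ext", 363_000)` yields status `WARN` with a ratio of 3.63, which rounds to the 3.6× shown above.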
Enforcement
Dagger TS module at dagger/ (per pipeline-stack-2026-07-01):
    # Locally
    cd dagger && dagger call audit-repo-tokens --source ..
    # Or the Node fallback (no Dagger daemon)
    node scripts/audit-repo-code-tokens.mjs
GHA workflow .github/workflows/repo-size-audit.yml runs the Dagger call weekly + on every PR touching .gitmodules or dagger/**. Advisory only — never fails. The workflow surfaces WARN in job output. Reviewers decide if it matters.
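The "advisory only" behaviour comes down to two properties: warnings are surfaced in job output, and the exit code is always 0. The sketch below illustrates this under stated assumptions. `buildAuditReport` and `RepoTokens` are hypothetical names, and the real Dagger module or Node script may format its output differently. The `::warning` prefix is the GitHub Actions workflow command for warning annotations.

```typescript
interface RepoTokens {
  repo: string;
  tokens: number;
}

const WARN_THRESHOLD_TOKENS = 100_000;

// Hypothetical helper: builds the job output for an advisory-only audit.
function buildAuditReport(results: RepoTokens[]): { lines: string[]; exitCode: 0 } {
  const lines = results
    .filter((r) => r.tokens >= WARN_THRESHOLD_TOKENS)
    .map((r) => {
      const kTokens = Math.round(r.tokens / 1000);
      // GitHub Actions renders lines in this format as warning annotations.
      return `::warning title=Repo size::${r.repo} is at ${kTokens}K tokens (WARN threshold 100K)`;
    });
  // The exit code is the literal 0: no input can turn the audit into a gate.
  return { lines, exitCode: 0 };
}
```

Typing `exitCode` as the literal `0` makes the "never fails" rule visible in the signature, so a later change that tried to return a non-zero code would fail to compile.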
How to split when a repo genuinely feels too big
Per atomic-packages-lazy:
- Identify a subset of at least 3-5 exports with an independent lifecycle
- Extract it into chirag127/<subset-name>
- Publish as an npm/pypi/cargo package if third parties would use it, else as a git submodule
- Delete the extracted code from the parent; update imports
When NOT to split:
- Repo is above WARN but code is genuinely cohesive (bookmark-mind-bs-ext keeps ext + tests + docs together intentionally)
- Extraction candidate has <3 exports — extraction thrash exceeds gain
- The repo is single-purpose (an app, a service, an extension)
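The split and no-split criteria above can be written down as a checklist. This is only a sketch of the reasoning, since splitting remains a judgement call: `splitAdvice` and every field of `SplitCandidate` are hypothetical names, and the boolean inputs still have to be answered by a human.

```typescript
interface SplitCandidate {
  isFork: boolean;                  // repo lives under repos/frk/**
  exportCount: number;              // exports in the proposed subset
  hasIndependentLifecycle: boolean; // subset versions and ships on its own
  parentIsSinglePurpose: boolean;   // an app, a service, an extension
  parentIsCohesive: boolean;        // code intentionally kept together
}

// Hypothetical helper: returns advice plus the reason, never a hard verdict.
function splitAdvice(c: SplitCandidate): { consider: boolean; reason: string } {
  if (c.isFork) return { consider: false, reason: "forks are never split" };
  if (c.exportCount < 3) return { consider: false, reason: "fewer than 3 exports" };
  if (c.parentIsSinglePurpose) return { consider: false, reason: "repo is single-purpose" };
  if (c.parentIsCohesive) return { consider: false, reason: "code is genuinely cohesive" };
  if (!c.hasIndependentLifecycle) return { consider: false, reason: "no independent lifecycle" };
  return { consider: true, reason: "subset is large enough and independent" };
}
```

Token count is deliberately absent from the inputs: being above WARN is what prompts the question, but it does not decide the answer.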
Fork oversize policy
- Never split a fork; doing so breaks no-fork-divergence.
- Optional advocacy: file an issue upstream. See precedents: screenpipe#4890, BCU#955, OmniRoute#6065.
Anti-patterns
- ❌ Splitting a fork to hit the ceiling — hard rule violation
- ❌ Making the GHA fail (blocking merge) on size — advisory only
- ❌ Merging repos to shrink a related-but-lifecycle-independent split
- ❌ Applying the threshold to markdown — knowledge/ bundle can grow freely
- ❌ Chasing the threshold by moving code into node_modules or vendored dirs: the audit normalizes those out
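The last anti-pattern fails because dependency and vendored directories are dropped before counting. A minimal sketch of such a path filter follows. `node_modules` comes from the text above, while the other directory names (`vendor`, `third_party`) and the helper name `isVendoredPath` are assumptions, not the audit's confirmed list.

```typescript
// Assumption: only "node_modules" is named in this document;
// "vendor" and "third_party" are illustrative guesses.
const IGNORED_DIR_NAMES: ReadonlySet<string> = new Set([
  "node_modules",
  "vendor",
  "third_party",
]);

// Hypothetical helper: true when any path segment is an ignored directory.
function isVendoredPath(filePath: string): boolean {
  return filePath
    .split(/[\\/]/)
    .some((segment) => IGNORED_DIR_NAMES.has(segment));
}
```

Matching on whole path segments rather than substrings keeps a file such as `src/vendors.ts` in the count while still dropping `src/vendor/lib.js`.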