Data lives in each app's own repo — no separate data repos for janaushdhi/ncert/financial-cards
Decision
Each data-driven app (oriz-janaushdhi-app, oriz-ncert-app, oriz-financial-cards-app, etc.) keeps its own data in its own data/ directory inside its own GitHub repo.
NO separate oriz-*-data repos created for any of them.
Existing API service repos (oriz-flow-fii-dii-activity-api, oriz-mmi-tickertape-mmi-api) stay — they're services with data/ dirs of their own, served via GH Pages.
Why
User mandate verbatim: "None of them require a separate data repo. All data in the repo of their creation. We are moving to the monorepo. I don't want to increase the number of repositories just for the sake of it."
51 submodules is enough. Adding 3-5 more -data repos for the sake of architectural purity isn't worth the maintenance overhead.
How data updates work
Per app, daily/weekly/monthly cron in .github/workflows/scrape.yml:
- Scraper script runs (e.g. Playwright fetches the medicine CSV)
- Output → data/<YYYY-MM-DD>.json + data/latest.json in the app repo
- Workflow commits + pushes to main
- CF Pages auto-redeploys on push
- Site rebuilds with the fresh data baked in
App-level GH Action handles everything; zero external coordination.
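
As an illustration of the output step, here is a minimal sketch of how a scraper could write the dated snapshot and latest.json side by side. The function names planSnapshot and writeSnapshot are hypothetical, and the UTC date and the pretty-printed JSON are assumptions; the actual scrapers may differ.

```typescript
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: plans the two files one scrape run should produce.
// Returns file name -> file contents, without touching the disk.
function planSnapshot(records: unknown, now: Date = new Date()): Record<string, string> {
  // Assumption: the snapshot date is taken in UTC, formatted as YYYY-MM-DD.
  const day = now.toISOString().slice(0, 10);
  const body = JSON.stringify(records, null, 2) + "\n";
  return { [`${day}.json`]: body, "latest.json": body };
}

// Hypothetical helper: writes the planned files into dataDir (e.g. "data"),
// creating the directory if it does not exist yet.
function writeSnapshot(dataDir: string, records: unknown, now: Date = new Date()): string[] {
  mkdirSync(dataDir, { recursive: true });
  const written: string[] = [];
  for (const [name, body] of Object.entries(planSnapshot(records, now))) {
    const path = join(dataDir, name);
    writeFileSync(path, body);
    written.push(path);
  }
  return written;
}
```

Keeping the planning step free of disk access makes the file naming easy to check in isolation; the workflow's commit and push steps then pick up whatever changed under data/.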
Runtime fetch for freshness
Where data MUST be live (intraday market data, live counters), apps lazy-fetch from raw URLs:
- paisa-finance fetches FII/DII + MMI from raw.githubusercontent.com/chirag127/oriz-flow-fii-dii-activity-api/main/data/latest.json + similar for MMI
- Lazy + SWR (stale-while-revalidate) + localStorage 1h TTL: shows cached immediately, fetches fresh in background
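
The caching behaviour described above could look roughly like the sketch below. It is not the code in paisa-finance: swrFetch, KeyValueStore, and CacheEntry are hypothetical names, and it assumes that an entry younger than the TTL is served without any network call, while an older entry is served immediately and refreshed in the background. The store and clock are passed in so the logic can run outside a browser; window.localStorage fits the KeyValueStore shape.

```typescript
// Minimal storage surface; window.localStorage satisfies it in the browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface CacheEntry<T> {
  savedAt: number; // epoch milliseconds
  data: T;
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper: returns the cached value immediately when one exists and,
// if it is older than ttlMs, starts a background refresh for the next caller.
async function swrFetch<T>(
  key: string,
  load: () => Promise<T>,
  store: KeyValueStore,
  ttlMs: number = ONE_HOUR_MS,
  now: () => number = Date.now,
): Promise<T> {
  const refresh = async (): Promise<T> => {
    const data = await load();
    const entry: CacheEntry<T> = { savedAt: now(), data };
    store.setItem(key, JSON.stringify(entry));
    return data;
  };

  const raw = store.getItem(key);
  if (raw === null) return refresh(); // cold cache: nothing to show yet

  let cached: CacheEntry<T>;
  try {
    cached = JSON.parse(raw) as CacheEntry<T>;
  } catch {
    return refresh(); // corrupt entry: treat as a cold cache
  }

  if (now() - cached.savedAt > ttlMs) {
    // Stale: serve the cached copy now and revalidate without blocking the caller.
    // A failed refresh is ignored so the stale copy stays usable.
    void refresh().catch(() => undefined);
  }
  return cached.data;
}
```

In the browser, a call might be `swrFetch("fii-dii", () => fetch("https://raw.githubusercontent.com/chirag127/oriz-flow-fii-dii-activity-api/main/data/latest.json").then((r) => r.json()), window.localStorage)`, where the cache key "fii-dii" is an arbitrary choice.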
Cross-refs
- Market data per-repo pattern → [[decisions/ops/market-data-per-repo]]
- janaushdhi app scope → [[decisions/apps/janaushdhi-app-scope]]
- ncert combined PDF directory → [[decisions/apps/ncert-combined-pdf-directory]]