
Zero-cost inference backends — Ollama + Cloudflare Workers AI + Puter.js

decision · ai · inference · ollama · cloudflare-workers-ai · puter-js · no-card · grill-decision · fallback-ladder

Zero-cost inference backends (2026-06-30 grill)

Decision

Three zero-cost model backends are approved for oriz workflows, each covering a distinct deployment surface:

| Backend | Surface | Free tier | Card? | Role |
| --- | --- | --- | --- | --- |
| Ollama | Local, dev machine | Unlimited (your GPU) | n/a | Primary dev runtime; offline; CI on workstation |
| Cloudflare Workers AI | Serverless, edge Worker | 10,000 neurons/day | NO | Primary serverless runtime; prod-side inference |
| Puter.js | Browser, end-user pays | Unlimited from our side | NO (end-user may optionally add one to their Puter account) | User-facing chat and on-page AI features |

All three pass the no-card-on-file hard rule, and all three already have service entries in knowledge/services/business/ai/. This decision codifies them as a single end-to-end fallback ladder.

Why a ladder (not pick-one)

Routing rules

| Workload | Backend (first pick) | Fallback |
| --- | --- | --- |
| Dev on laptop, no network / offline | Ollama (localhost:11434/v1/chat/completions) | Cloudflare Workers AI |
| Prod inference inside a Cloudflare Worker | Cloudflare Workers AI (native env.AI binding, called via env.AI.run()) | Puter.js dispatched to client |
| User-facing chat in browser | Puter.js | Cloudflare Workers AI for hard server-side steps |
| Open-source CLI agent failover (any of Aider, Cline, Kilo Code, OpenCode, gocode, Coddy) | Ollama at localhost:11434 (all confirmed OpenAI-compat) | Cloudflare Workers AI over HTTPS |
| Free-tier hosted Google model for general chat | Gemini CLI (see gemini-cli-agent-addition-2026-06-30) | n/a (no public REST) |
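
The routing table above can be read as an ordered list of backends per workload. The following is a minimal sketch of that idea, not the oriz implementation: the names `Backend`, `Workload`, `LADDER`, and `pickBackend` are hypothetical, and the availability check is left to the caller. The Gemini CLI row is omitted because it has no fallback.

```typescript
// Hypothetical names throughout; none of these come from an oriz codebase.
type Backend = "ollama" | "workers-ai" | "puter-js";
type Workload = "dev-offline" | "prod-worker" | "browser-chat" | "cli-agent";

// First pick, then fallback, mirroring the rows of the routing table.
const LADDER: Record<Workload, readonly Backend[]> = {
  "dev-offline": ["ollama", "workers-ai"],
  "prod-worker": ["workers-ai", "puter-js"],
  "browser-chat": ["puter-js", "workers-ai"],
  "cli-agent": ["ollama", "workers-ai"],
};

// Returns the first backend on the ladder that the caller reports as
// available, or null when every rung is unavailable.
function pickBackend(
  workload: Workload,
  isAvailable: (backend: Backend) => boolean,
): Backend | null {
  for (const backend of LADDER[workload]) {
    if (isAvailable(backend)) return backend;
  }
  return null;
}
```

For example, `pickBackend("prod-worker", (b) => b !== "workers-ai")` falls through to `"puter-js"`, matching the second row of the table.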

Per-backend details

Quota invariants (per never-hit-quotas)

| Backend | Soft alarm trip | Cap |
| --- | --- | --- |
| Cloudflare Workers AI | 5,000 neurons/day (50%) | 10,000 neurons/day hard cap |
| Local Ollama | Disk space | No API-side cap |
| Puter.js | n/a (end-user pays) | Per-user at Puter's discretion |
| Gemini CLI | 600 req/day (60%); 36 req/min (60%) | 1,000 req/day, 60 req/min |
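
One way to enforce the numeric rows of the table above is a small threshold check. This is a sketch under assumptions: `QuotaRule`, `RULES`, `quotaState`, and the key strings are hypothetical names, and how usage is actually counted is out of scope here. Ollama and Puter.js are omitted because the table gives them no numeric cap.

```typescript
// Hypothetical names; the thresholds are copied from the quota table.
type QuotaState = "ok" | "soft-alarm" | "at-cap";

interface QuotaRule {
  soft: number; // usage at which the soft alarm trips
  cap: number;  // hard limit for the window
}

const RULES: Record<string, QuotaRule> = {
  "workers-ai:neurons/day": { soft: 5_000, cap: 10_000 },
  "gemini-cli:req/day": { soft: 600, cap: 1_000 },
  "gemini-cli:req/min": { soft: 36, cap: 60 },
};

// Classifies current usage against the soft alarm and the hard cap.
function quotaState(key: string, used: number): QuotaState {
  const rule = RULES[key];
  if (rule === undefined) throw new Error(`no quota rule for ${key}`);
  if (used >= rule.cap) return "at-cap";
  if (used >= rule.soft) return "soft-alarm";
  return "ok";
}
```

Under this sketch, 5,000 neurons used in a day reports `"soft-alarm"` for Workers AI, which could be the signal to move the next request down the ladder before the hard cap is reached.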

What this decision does NOT do

Cross-refs