Building token-efficient AI workflows: a practical guide
Concrete patterns for getting more out of Claude Code, Codex CLI and Gemini CLI without burning your monthly budget by lunchtime. Includes config snippets, hook recipes and a checklist.
You've probably done the math by now: a single Claude Code refactor session can cost more than a SaaS subscription. The fix isn't "use AI less"; it's "use AI better".
Here's the working set of patterns we've collected from teams running DRIP in production for six months.
Pattern 1: Read once, edit often
Stop telling the agent to "look at the file" repeatedly. Once it has the file in context, ask for the change directly.
Bad:
Read app.py
Now read it again and tell me what the `parse_args` function does
Now refactor it to use argparse
Read it back and confirm
Good:
Refactor parse_args in app.py to use argparse. Show me the diff before applying.
Same outcome. Quarter the tokens. The agent handles the read/verify/write cycle internally — you don't need to choreograph it.
Pattern 2: Trust the delta certificate
When DRIP returns [DRIP: edit verified | hash: 0xa31… | 390 B], that is the verification. You don't need to ask the agent to re-read the file "just to be sure". The hash + touched-lines payload is cryptographically tied to the edit that just happened.
Modern agents respect these certificates by default. Older prompt patterns ("read the file again to verify") were calibrated for tooling that didn't have edit attestation. Drop them.
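If you want a mental model of why a hash makes the re-read redundant, here is a minimal sketch. It is not DRIP's implementation: the choice of SHA-256, the helper names content_hash and edit_matches_certificate, and the prefix comparison are all assumptions made for illustration.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a hex digest of the file content (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def edit_matches_certificate(data: bytes, certified_prefix: str) -> bool:
    """Check that the content starts with the hash prefix taken from a certificate.

    `certified_prefix` stands in for the truncated hash an attestation might
    display; the real certificate format is not specified in this guide.
    """
    return content_hash(data).startswith(certified_prefix.lower())


# Hypothetical usage: the bytes written by an edit and the hash recorded for it.
after_edit = b"import argparse\n\ndef parse_args():\n    ...\n"
recorded = content_hash(after_edit)[:8]

# First check: content untouched since the edit.
print(edit_matches_certificate(after_edit, recorded))
# Second check: content modified after the edit.
print(edit_matches_certificate(after_edit + b"# x", recorded))
```

The point of the sketch: if the recorded hash still matches, the bytes are the ones the edit produced, so a full re-read adds tokens without adding information.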
Pattern 3: Use partial reads when you know the section
If you're investigating a 600-line file but only care about lines 200–280, ask for that range:
Read app.py lines 200-280 and explain how the auth middleware works.
Most agents will translate this into a call like Read(file, offset: 199, limit: 81), since lines 200–280 inclusive span 81 lines. DRIP handles partial reads with a window-scoped diff: if you re-ask for the same window later, you get an [unchanged (lines 200-280)] sentinel instead of the full 81-line resend.
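The range arithmetic and the sentinel behaviour can be sketched in a few lines. This is an illustration only, assuming a 0-indexed offset and an in-memory cache; WindowCache and range_to_offset_limit are hypothetical names, not part of DRIP or any agent's API. Note that an inclusive range of 200-280 covers 81 lines.

```python
from typing import Dict, List, Tuple


def range_to_offset_limit(first_line: int, last_line: int) -> Tuple[int, int]:
    """Convert a 1-indexed inclusive line range to a 0-indexed offset and a line count."""
    if first_line < 1 or last_line < first_line:
        raise ValueError("expected 1 <= first_line <= last_line")
    return first_line - 1, last_line - first_line + 1


class WindowCache:
    """Remembers the last content served for each (path, offset, limit) window."""

    def __init__(self) -> None:
        self._seen: Dict[Tuple[str, int, int], List[str]] = {}

    def read(self, path: str, lines: List[str], first_line: int, last_line: int) -> str:
        offset, limit = range_to_offset_limit(first_line, last_line)
        window = lines[offset:offset + limit]
        key = (path, offset, limit)
        if self._seen.get(key) == window:
            # Same window, same content: answer with a short sentinel instead.
            return f"[unchanged (lines {first_line}-{last_line})]"
        self._seen[key] = window
        return "\n".join(window)


# Hypothetical 600-line file held in memory.
source = [f"line {n}" for n in range(1, 601)]
cache = WindowCache()
first = cache.read("app.py", source, 200, 280)
second = cache.read("app.py", source, 200, 280)
print(len(first.splitlines()))
print(second)
```

The first read pays for the whole window; the repeat costs one short line, as long as nothing inside the window changed.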
Pattern 4: Compact your context before you start a long task
If you've been chatting with the agent for an hour and you're about to ask it to do something demanding (big refactor, multi-file edit), compact first. Claude Code's /compact command summarises the conversation and resets the working context.
Without a manual /compact, compaction is forced on the agent mid-task when it hits the model's context limit, and it then re-reads files it had already seen. With manual compaction, you control when it happens, and DRIP's v9 context-compaction ledger tells you whether the agent had to re-send tokens after the reset.
Pattern 5: Pick the right model for the task
Not every task needs Opus 4.7.
| Task | Model | Why |
|---|---|---|
| One-line edit / typo fix | Haiku 4.5 | 50× cheaper, equivalent on small ops |
| Standard refactor (1 file) | Sonnet 4.6 | Sweet spot of capability vs. price |
| Multi-file architectural changes | Opus 4.7 | Only model that maintains coherence |
| Debugging across the stack | Sonnet 4.6 → Opus 4.7 | Start cheap, escalate on complexity |
In Claude Code: claude --model sonnet-4-6 for the cheap default, switch to Opus when the work warrants. In Codex CLI: codex --model gpt-5. In Gemini CLI: gemini --model gemini-2.5-pro.
DripMeter's cost projection panel lets you compare what the same session would have cost on each model, so you can calibrate after the fact.
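The arithmetic behind that kind of comparison is simple enough to run by hand. Here is a rough sketch, not DripMeter's code: the tier names and every price in the table are placeholders, so swap in your provider's current per-million-token rates before drawing conclusions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Price:
    """Price in dollars per million tokens (placeholder values, not real pricing)."""
    input_per_mtok: float
    output_per_mtok: float


# Hypothetical price table; replace with your provider's current rates.
PRICES: Dict[str, Price] = {
    "small": Price(1.0, 5.0),
    "medium": Price(3.0, 15.0),
    "large": Price(15.0, 75.0),
}


def session_cost(input_tokens: int, output_tokens: int, price: Price) -> float:
    """Dollar cost of one session at the given per-million-token prices."""
    total = input_tokens * price.input_per_mtok + output_tokens * price.output_per_mtok
    return total / 1_000_000


def project(input_tokens: int, output_tokens: int) -> Dict[str, float]:
    """Cost of the same token counts on every tier in PRICES."""
    return {
        name: round(session_cost(input_tokens, output_tokens, price), 4)
        for name, price in PRICES.items()
    }


# Hypothetical session: 400K input tokens, 20K output tokens.
for tier, dollars in project(input_tokens=400_000, output_tokens=20_000).items():
    print(f"{tier:>6}: ${dollars:.2f}")
```

Running the same session totals through each tier is a quick way to see whether escalating to the larger model was worth it for that task.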
Pattern 6: Watch the meter, set a daily target
A daily token-saved target is a surprisingly effective forcing function. Set it in DripMeter (default: 25K, but tune to your team's baseline). When you hit it, you've validated that your workflow is healthy. When you miss it three days running, you have a signal that something changed — usually you've started a task that doesn't benefit from caching (greenfield, no re-reads).
The 🔥 streak counter also surprises us: teams that consciously try to hit 7+ day streaks report ~20% lower monthly bills than teams that don't track at all. Same workflows, same agents, same models. Just attention.
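If you'd rather track this in a script than in a UI, the two signals above (current streak, three misses in a row) reduce to a few lines. This is a sketch under assumptions: the function names and the sample figures are made up, and it says nothing about how DripMeter computes its own counter.

```python
from typing import List


def current_streak(daily_saved: List[int], target: int) -> int:
    """Count consecutive days, ending with the most recent, that met the target."""
    streak = 0
    for saved in reversed(daily_saved):
        if saved < target:
            break
        streak += 1
    return streak


def missed_three_running(daily_saved: List[int], target: int) -> bool:
    """True when the three most recent days all fell short of the target."""
    recent = daily_saved[-3:]
    return len(recent) == 3 and all(saved < target for saved in recent)


# Hypothetical tokens-saved figures, oldest first, against the 25K default target.
week = [31_000, 12_000, 26_000, 27_500, 40_000]
print(current_streak(week, target=25_000))
print(missed_three_running(week, target=25_000))
```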
Pattern 7: Run DripMeter's report monthly
drip meter --json plus DripMeter's "Save report" action produces a Markdown summary you can drop into a monthly engineering update. It has:
- Lifetime tokens saved
- Dollar value at current model pricing
- Per-agent breakdown (which CLIs are pulling their weight)
- Top files (where you spend your read budget)
- 30-day rollup with daily average
Sharing this in a Slack channel keeps the team's attention on the win. It's also a great document to send to whoever signs the AWS bill.
Checklist
A quick self-audit. If you can answer yes to most of these, your token spend is in good shape.
- drip is installed and wired into all three agents (drip init -g)
- You can see today's savings in DripMeter without thinking about it
- You've set a daily token-saved target you actually hit most days
- You're using Haiku for trivial tasks, Sonnet for default, Opus only when needed
- You compact context manually before long sessions
- You don't ask the agent to re-read files "just to verify" after an edit
- You watch the compaction ledger and know how often your sessions hit it
- You ran the monthly usage report at least once and saved it somewhere
What's next
We're working on a Claude Code MCP server that surfaces DRIP's stats inline (so you can ask Claude "how much did I save today?" and have it answer from local data, zero network). Watch the DRIP repo for it.
In the meantime: install DRIP, wire it in, and forget about it. The win compounds.