
An Open Letter to Anthropic and OpenClaw: Image Bloat Is Costing You Money

Your OAuth subscribers are burning compute on dead images. That's your margin walking out the door.

By Joe Loves Tech · March 29, 2026 · 6 min read


The Business Problem

Dear Anthropic and OpenClaw teams,

I run 10 AI agents on OpenClaw with a Claude Max Pro subscription. Yesterday I discovered my agents were carrying 675,000+ tokens of dead image data — screenshots, dashboards, error logs — all stored as raw base64 in session history, re-sent to your API on every single turn.

I'm an OAuth subscriber. I pay a flat monthly fee. Every token I send costs YOU compute, not me. And right now, a massive chunk of those tokens are images my agents will never look at again.

This isn't just my problem. It's an ecosystem-wide margin drain.

The Math That Should Worry You

Anthropic's perspective (OAuth subscribers)

OAuth/Max Pro subscribers pay ~$100-200/month for unlimited (rate-limited) usage. Anthropic bears the compute cost per token. Every unnecessary token processed is pure margin erosion.

Estimated across the OpenClaw OAuth subscriber base, that comes to roughly 367 billion tokens per day that your GPUs are processing, your attention layers are computing, and your infrastructure is serving, all for images that provide zero value to the conversation.

At an estimated inference cost of ~$0.50-1.00 per million tokens (internal cost, not list pricing), 367 billion tokens per day works out to roughly $183K-$367K per day, or about $67M-$134M per year.

These are rough estimates. But even at 10% of these numbers, that's $6.7M-$13.4M/year in compute spent processing dead images for flat-fee subscribers.
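The arithmetic behind these figures is easy to check. The inputs below are the letter's own rough estimates, not measured data:

```python
# Back-of-envelope check of the letter's estimates. Both inputs are
# the letter's stated assumptions, not measured figures.
DAILY_IMAGE_TOKENS = 367e9            # image tokens/day across the OAuth base
COST_LOW, COST_HIGH = 0.50, 1.00      # estimated internal $ cost per 1M tokens

daily_low = DAILY_IMAGE_TOKENS / 1e6 * COST_LOW
daily_high = DAILY_IMAGE_TOKENS / 1e6 * COST_HIGH
annual_low, annual_high = daily_low * 365, daily_high * 365

print(f"Daily:  ${daily_low:,.0f} - ${daily_high:,.0f}")              # $183,500 - $367,000
print(f"Annual: ${annual_low / 1e6:.1f}M - ${annual_high / 1e6:.1f}M")  # $67.0M - $134.0M
print(f"At 10%: ${annual_low * 0.1 / 1e6:.1f}M - ${annual_high * 0.1 / 1e6:.1f}M")
```

Even discounted by 90%, the annual range lands on the $6.7M-$13.4M cited above.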

The OpenClaw Perspective

For OpenClaw, the problem manifests as user experience degradation: subscribers watch stale image data eat their context windows and hit their rate limits sooner than they should.

Every frustrated user who hits 100% usage because of image bloat is a potential churner. And the fix is straightforward.

What We Built (and why it should be native)

We built 🦐 Shrink, an open-source skill that replaces base64 images in session history with compact text descriptions via Three-Tier Extraction™.
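The core mechanic is simple enough to sketch. The message shape below and the `describe_image()` stub are simplified assumptions for illustration, not Shrink's actual implementation:

```python
# Minimal sketch of the idea behind Shrink: walk a session's message
# history and swap raw base64 image blocks for short text blocks.
# describe_image() is a stub; in practice a vision model would be
# called once per image and the result cached.

def describe_image(data_b64: str) -> str:
    return f"[image replaced with cached description; {len(data_b64)} base64 chars freed]"

def shrink_history(messages: list[dict]) -> tuple[list[dict], int]:
    """Replace image blocks with text blocks; return new history and chars freed."""
    freed = 0
    shrunk = []
    for msg in messages:
        blocks = []
        for block in msg.get("content", []):
            if block.get("type") == "image":
                data = block["source"]["data"]
                freed += len(data)
                blocks.append({"type": "text", "text": describe_image(data)})
            else:
                blocks.append(block)
        shrunk.append({**msg, "content": blocks})
    return shrunk, freed

history = [{"role": "user", "content": [
    {"type": "image", "source": {"type": "base64",
                                 "media_type": "image/png",
                                 "data": "A" * 9000}},
    {"type": "text", "text": "What does this dashboard show?"},
]}]
new_history, freed = shrink_history(history)
# The image block is now a short text description; 9,000 base64 chars freed.
```

The original history is left untouched, so a tier that preserves the raw image elsewhere (for later retrieval) can sit on top of this without changing the loop.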

Results from our production fleet:

Metric               Result
Images processed     181
Output               Three-tier text descriptions
Tokens freed         3.3M
Reduction per image  97%
Total cost           $0.08
Information loss     Zero

It works. It's battle-tested. It's MIT licensed. But it shouldn't be a third-party skill — it should be native infrastructure.

The Proposal

To Anthropic

  1. Server-side image caching — If an image hasn't been referenced in N turns, serve a cached description instead of re-processing the raw pixels. You already have the vision model. Use it once per image, not once per turn.
  2. Image TTL in the Messages API — Let developers specify an image lifecycle. After N turns, automatically replace with a description. This alone could recover a significant share of your OAuth margin.
  3. Inference-layer deflation — Before an image enters the attention computation, check if it's been seen before. If so, substitute the cached representation. The compute savings compound across millions of users.

To OpenClaw

  1. Native /shrink command — alongside /compact. /compact for text, /shrink for images. We have a working implementation ready for PR.
  2. Auto-shrink on compaction — When /compact runs, it should handle images too, not skip them. Use the same three-tier extraction approach.
  3. Image lifecycle management — Session configuration for image TTL. After N turns without reference, auto-extract and replace.
  4. Session health metrics — Show users how much of their context window is images vs text. Awareness drives action.
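The session health metric in proposal 4 needs nothing more than a pass over the history. The token counts below are crude character-based estimates, purely for illustration; a real implementation would use the model's tokenizer:

```python
# Sketch of a session health metric: what share of the context window
# is image data vs text? Character-based token estimates are a rough
# stand-in for a real tokenizer.

def context_breakdown(messages: list[dict]) -> dict:
    image_tokens = text_tokens = 0
    for msg in messages:
        for block in msg.get("content", []):
            if block.get("type") == "image":
                # base64 chars -> rough token estimate (~1 token / 3 chars)
                image_tokens += len(block["source"]["data"]) // 3
            elif block.get("type") == "text":
                text_tokens += len(block["text"]) // 4
    total = image_tokens + text_tokens
    pct = 100 * image_tokens / total if total else 0.0
    return {"image_tokens": image_tokens,
            "text_tokens": text_tokens,
            "image_pct": round(pct, 1)}
```

Surfacing that one percentage next to the usage meter would tell a user at a glance whether /compact or an image-aware cleanup is the right move.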

The Numbers If You Act

If native image deflation reduced image tokens by 97% across the OAuth base:

Metric                       Today          With Native Deflation
Daily image tokens (OAuth)   367B           ~11B
Daily compute cost           $183K-$367K    $5.5K-$11K
Monthly savings                             $5.3M-$10.7M
User context window freed    0%             ~40-60%
Rate limit complaints        High           Significantly reduced
Subscriber satisfaction      Frustrated     Empowered

Why Open Source

Shrink is MIT licensed. We're not building a business — we're solving a problem. Take the code, take the approach, build it natively. We'll submit the PR.

Shrink your context. Not your capabilities.

Joe Loves Tech (@joelovestech1)

🦐 getshrink.dev · github.com/joelovestech/shrink · clawhub install shrink
