
An Open Letter to Anthropic and OpenClaw: Image Bloat Is Costing You Money

Your OAuth subscribers are burning compute on dead images. That's your margin walking out the door.

By Joe Loves Tech · March 29, 2026 · 6 min read


The Business Problem

Dear Anthropic and OpenClaw teams,

I run 10 AI agents on OpenClaw with a Claude Max Pro subscription. Yesterday I discovered my agents were carrying 675,000+ tokens of dead image data — screenshots, dashboards, error logs — all stored as raw base64 in session history, re-sent to your API on every single turn.

I'm an OAuth subscriber. I pay a flat monthly fee. Every token I send costs YOU compute, not me. And right now, a massive chunk of those tokens are images my agents will never look at again.

This isn't just my problem. It's an ecosystem-wide margin drain.

The Math That Should Worry You

Anthropic's perspective (OAuth subscribers)

OAuth/Max Pro subscribers pay ~$100-200/month for unlimited (rate-limited) usage. Anthropic bears the compute cost per token. Every unnecessary token processed is pure margin erosion.

Estimated across the OpenClaw OAuth subscriber base, that comes to roughly 367 billion tokens per day that your GPUs are processing, your attention layers are computing, and your infrastructure is serving, all for images that provide zero value to the conversation.

At an estimated inference cost of ~$0.50-1.00 per million tokens (internal cost, not list pricing), 367 billion tokens per day works out to roughly $183K-$367K per day, or about $67M-$134M per year.

These are rough estimates. But even at 10% of these numbers, that's $6.7M-$13.4M/year in compute spent processing dead images for flat-fee subscribers.
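The arithmetic behind these figures is easy to check. The inputs below are the letter's own rough estimates, not measured data:

```python
# Back-of-envelope check of the letter's estimates. Both inputs are
# the letter's stated assumptions, not measured figures.
DAILY_IMAGE_TOKENS = 367e9            # image tokens/day across the OAuth base
COST_LOW, COST_HIGH = 0.50, 1.00      # estimated internal $ cost per 1M tokens

daily_low = DAILY_IMAGE_TOKENS / 1e6 * COST_LOW
daily_high = DAILY_IMAGE_TOKENS / 1e6 * COST_HIGH
annual_low, annual_high = daily_low * 365, daily_high * 365

print(f"Daily:  ${daily_low:,.0f} - ${daily_high:,.0f}")              # $183,500 - $367,000
print(f"Annual: ${annual_low / 1e6:.1f}M - ${annual_high / 1e6:.1f}M")  # $67.0M - $134.0M
print(f"At 10%: ${annual_low * 0.1 / 1e6:.1f}M - ${annual_high * 0.1 / 1e6:.1f}M")
```

Even discounted by 90%, the annual range lands on the $6.7M-$13.4M cited above.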

The OpenClaw Perspective

For OpenClaw, the problem manifests as user experience degradation: subscribers watch stale image data eat their context windows and hit their rate limits sooner than they should.

Every frustrated user who hits 100% usage because of image bloat is a potential churner. And the fix is straightforward.

What We Built (and why it should be native)

We built 🦐 Shrink, an open-source skill that replaces base64 images in session history with compact text descriptions via Three-Tier Extraction™.
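The core mechanic is simple enough to sketch. The message shape below and the `describe_image()` stub are simplified assumptions for illustration, not Shrink's actual implementation:

```python
# Minimal sketch of the idea behind Shrink: walk a session's message
# history and swap raw base64 image blocks for short text blocks.
# describe_image() is a stub; in practice a vision model would be
# called once per image and the result cached.

def describe_image(data_b64: str) -> str:
    return f"[image replaced with cached description; {len(data_b64)} base64 chars freed]"

def shrink_history(messages: list[dict]) -> tuple[list[dict], int]:
    """Replace image blocks with text blocks; return new history and chars freed."""
    freed = 0
    shrunk = []
    for msg in messages:
        blocks = []
        for block in msg.get("content", []):
            if block.get("type") == "image":
                data = block["source"]["data"]
                freed += len(data)
                blocks.append({"type": "text", "text": describe_image(data)})
            else:
                blocks.append(block)
        shrunk.append({**msg, "content": blocks})
    return shrunk, freed

history = [{"role": "user", "content": [
    {"type": "image", "source": {"type": "base64",
                                 "media_type": "image/png",
                                 "data": "A" * 9000}},
    {"type": "text", "text": "What does this dashboard show?"},
]}]
new_history, freed = shrink_history(history)
# The image block is now a short text description; 9,000 base64 chars freed.
```

The original history is left untouched, so a tier that preserves the raw image elsewhere (for later retrieval) can sit on top of this without changing the loop.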

Results from our production fleet:

Metric               Result
Images processed     181
Output               Three-tier text descriptions
Tokens freed         3.3M
Reduction per image  97%
Total cost           $0.08
Information loss     Zero

It works. It's battle-tested. It's MIT licensed. But it shouldn't be a third-party skill — it should be native infrastructure.

The Proposal

To Anthropic

  1. Server-side image caching — If an image hasn't been referenced in N turns, serve a cached description instead of re-processing the raw pixels. You already have the vision model. Use it once per image, not once per turn.
  2. Image TTL in the Messages API — Let developers specify an image lifecycle. After N turns, automatically replace with a description. This alone could recover a significant share of your OAuth margin.
  3. Inference-layer deflation — Before an image enters the attention computation, check if it's been seen before. If so, substitute the cached representation. The compute savings compound across millions of users.

To OpenClaw

  1. Native /shrink command — alongside /compact. /compact for text, /shrink for images. We have a working implementation ready for PR.
  2. Auto-shrink on compaction — When /compact runs, it should handle images too, not skip them. Use the same three-tier extraction approach.
  3. Image lifecycle management — Session configuration for image TTL. After N turns without reference, auto-extract and replace.
  4. Session health metrics — Show users how much of their context window is images vs text. Awareness drives action.
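The session health metric in proposal 4 needs nothing more than a pass over the history. The token counts below are crude character-based estimates, purely for illustration; a real implementation would use the model's tokenizer:

```python
# Sketch of a session health metric: what share of the context window
# is image data vs text? Character-based token estimates are a rough
# stand-in for a real tokenizer.

def context_breakdown(messages: list[dict]) -> dict:
    image_tokens = text_tokens = 0
    for msg in messages:
        for block in msg.get("content", []):
            if block.get("type") == "image":
                # base64 chars -> rough token estimate (~1 token / 3 chars)
                image_tokens += len(block["source"]["data"]) // 3
            elif block.get("type") == "text":
                text_tokens += len(block["text"]) // 4
    total = image_tokens + text_tokens
    pct = 100 * image_tokens / total if total else 0.0
    return {"image_tokens": image_tokens,
            "text_tokens": text_tokens,
            "image_pct": round(pct, 1)}
```

Surfacing that one percentage next to the usage meter would tell a user at a glance whether /compact or an image-aware cleanup is the right move.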

The Numbers If You Act

If native image deflation reduced image tokens by 97% across the OAuth base:

Metric                       Today          With Native Deflation
Daily image tokens (OAuth)   367B           ~11B
Daily compute cost           $183K-$367K    $5.5K-$11K
Monthly savings                             $5.3M-$10.7M
User context window freed    0%             ~40-60%
Rate limit complaints        High           Significantly reduced
Subscriber satisfaction      Frustrated     Empowered

Why Open Source

Shrink is MIT licensed. We're not building a business — we're solving a problem. Take the code, take the approach, build it natively. We'll submit the PR.

Shrink your context. Not your capabilities.

Joe Loves Tech (@joelovestech1)

🦐 getshrink.dev · github.com/joelovestech/shrink · clawhub install shrink
