How much does Shrink save on tokens?

Shrink typically reduces image tokens by 96-99%. A raw image costs ~18,000 tokens; after Shrink processing, the three-tier description uses only ~500 tokens. In production testing across 10 agents, Shrink saved over 3.3 million tokens.

How does three-tier extraction work?

Shrink produces three complementary tiers: CONTEXT (why the image matters in conversation), DATA (every readable value extracted), and VISUAL (design, layout, colors, spacing). Together they eliminate any reason to reference the raw image.

How do I install Shrink?

Install via ClawHub with: clawhub install shrink. Then use /shrink in chat or run the CLI directly with: python3 shrink.py --agent main

Shrink

Tired of hitting 100% usage because your agent can't let go of screenshots?

Every image in your agent's memory gets re-sent on every turn. Shrink replaces them with three-tier text descriptions. Same information. 97% fewer tokens. No more depression naps.

Get Started 📦 Install from ClawHub

$ clawhub install shrink

The Problem

You're burning tokens on images
your agent will never look at again.

That screenshot from 20 turns ago? Still in context. Still costing you tokens. Still pushing you toward that red 100% bar. And /compact can't help — it explicitly skips images.

Before Shrink

tokens per turn (5 images)

→

After Shrink

tokens per turn (5 images)

You've tried /compact — it nukes your text and skips images entirely. You've tried new sessions — goodbye context. You've tried ignoring it — hello rate limits.
What if you could keep every detail from every image in ~500 tokens instead of ~18,000?

How It Works

Three-Tier Extraction

Every question an agent could ask about an image falls into three buckets. Shrink captures all three.

Tier 1 — Context

Why the image matters

"SSH fingerprint verification dialog on Android, connecting to 192.168.86.194:222. User is setting up first-time SSH access from Samsung S24 phone."

Tier 2 — Data

Every readable value

Tier 3 — Visual

Design, layout, colors

"Dark theme (#1a1a1a). Dialog centered with rounded corners. Cancel in red, Continue in green. Sans-serif typography, 24px spacing."

❌ Before — Raw base64 in session

{"type":"image","data":"iVBORw0KGgoAAAANSUhEUgAABQAAAA UACAYAAACQB2wLAAAABHNCSVQICAgIfAhkiAAAA BmJLR0QA/wD/AP+gvaeTAAAACXBIWXMAAA7DAA AOxAGVKw4bAAAAB3RJTUUH4wYZBx0JExkRkAAA AaBpVFh0Q29tbWVudAAAAAAAQ3JlYXRlZCBieS BTY3JlZW5zaG90IFNvZnR3YXJlLiBDb3B5cmln aHQgwqkgMjAxOSBTY3JlZW5zaG90IFNvZnR3YX JlLCBJbmMuIEFsbCByaWdodHMgcmVzZXJ2ZWQu a1c5cPk2V7x9HqF...

~18,000 tokens

✅ After — Structured description

[🖼️ Image deflated by Shrink] CONTEXT: SSH fingerprint verification dialog on Android, connecting to 192.168.86.194:222. DATA: Time: 7:24 AM | Battery: 92% Host: Server | Key: ECDSA | Fingerprint: SHA256:D7BJ/VjYo2k6uFn31lm9TKtbF2Gu... VISUAL: Dark theme (#1a1a1a). Dialog centered, rounded corners. Cancel red, Continue green. Sans-serif, 24px spacing.

~500 tokens (97% savings)

Real Results

Production-tested across a 10-agent fleet

Real numbers from a real deployment. Not benchmarks — production data.

Images Processed

Tokens Saved

Total Cost

0.11

Turns to ROI

Approach	Token Cost	Info Preserved	Images?
Raw (no optimization)	~18,000/img	100%	✅
Image compression	~2,000/img	~85% (quality loss)	✅
/compact	Varies	~30-50% (summarized)	❌ Skipped
Context-only	~100/img	~60%	✅
🦐 Shrink (three-tier)	~500/img	~99%	✅

Features

Everything you need, nothing you don't

🧠

Three-Tier Extraction

Context + data + visual design in every description

♻️

Dedup Detection

Identical images share one API call. Saved 28% in production.

🔄

Auth Failover

Auto-rotates between API keys and OAuth tokens

📊

Fleet Management

Shrink every session for any agent in one command

💰

Cost Estimates

See API cost before running, with dedup savings

📋

JSON Output

Structured output for pipelines and automation

🎛️

12 CLI Flags

Fine-tune everything: model, depth, budget, detail level

🔒

Privacy-First

No telemetry. Only calls Anthropic API. Full --dry-run support.

Quick Start

Up and running in 30 seconds

Terminal

# Install from ClawHub
clawhub install shrink
        

          In chat
        
# Shrink your current session
/shrink

# Interactive — select agents, set options
/shrink --interactive

          CLI
          
# Preview what's in your session
python3 shrink.py --agent main --dry-run

# Shrink all sessions for an agent
python3 shrink.py --agent yancy --all-sessions

# Budget-conscious: limit + cheaper model
python3 shrink.py --agent main --max-images 10 --model claude-haiku-4-5

# JSON output for automation
python3 shrink.py --agent main --json

FAQ

Frequently Asked Questions

What is Shrink?

Shrink is a multimodal context optimizer for AI agents. It replaces base64-encoded images in session history with rich three-tier text descriptions (Context + Data + Visual), achieving 96-99% token reduction with zero information loss.

How much does it save on tokens?

A raw image costs ~18,000 tokens. After Shrink, the three-tier description uses ~500 tokens — a 97% reduction. In production across 10 agents, Shrink saved over 3.3 million tokens for $0.26 total.

Does it work with /compact?

Yes — they're complementary. /compact summarizes text but explicitly skips images. Shrink handles only images and leaves text untouched. Use Shrink first, then /compact as a last resort.

Is my data sent anywhere?

Images and surrounding context are sent only to the Anthropic vision API for description generation. No telemetry, no data collection, no other external services. Use --dry-run to preview everything before committing.

What frameworks does it support?

Shrink is built for OpenClaw and works with any agent that stores sessions as JSONL files. Install via ClawHub or run the CLI directly. Multi-framework support is on the roadmap.

How is this different from image compression?

Image compression (resize/quality reduction) saves ~80-90% of tokens but degrades text readability. Shrink extracts all data as text first, then removes the image — 98-99% savings with zero information loss.

Shrink

You're burning tokens on imagesyour agent will never look at again.

Three-Tier Extraction

Why the image matters

Every readable value

Design, layout, colors

Production-tested across a 10-agent fleet

Everything you need, nothing you don't

Up and running in 30 seconds

Frequently Asked Questions

You're burning tokens on images
your agent will never look at again.