Giving Claude a Crop Tool for Better Image Analysis
When Claude analyzes images, it sees the entire image at once. For detailed tasks—like reading small text, comparing similar values in a chart, or examining fine details—this can be limiting.
The solution: Give Claude a tool that lets it "zoom in" by cropping regions of interest.
This notebook shows how to build a simple crop tool and demonstrates when it's useful.
When is a Crop Tool Useful?
- Charts and graphs: Comparing bars/lines that are close in value, reading axis labels
- Documents: Reading small text, examining signatures or stamps
- Technical diagrams: Following wires/connections, reading component labels
- Dense images: Any image where details are small relative to the whole
Setup
Note: you may need to restart the kernel to use updated packages.
Load an Example Chart
We'll use a chart from the FigureQA dataset to demonstrate.
Question: Is Cyan the minimum? Answer: Yes.
Define the Crop Tool
The crop tool uses normalized coordinates (0-1) so Claude doesn't need to know the image dimensions:
(0, 0)= top-left corner(1, 1)= bottom-right corner(0.5, 0.5)= center
Let's test the crop tool manually:
Cropped to (0.00,0.00)-(0.40,0.35): 167x140px
The Agentic Loop
Now we connect everything: send the image to Claude with the crop tool available, and handle tool calls in a loop until Claude provides a final answer.
Demo: Chart Analysis
Let's ask Claude to analyze our chart. Watch how it uses the crop tool to examine specific regions.
Question: Is Cyan the minimum?
Claude's analysis:
[Assistant] I'll help you answer whether Cyan is the minimum in this pie chart. Let me examine the image more closely.
[Tool] crop_image({'x1': 0.0, 'y1': 0.0, 'x2': 0.3, 'y2': 0.3})
[Tool] crop_image({'x1': 0.3, 'y1': 0.3, 'x2': 0.7, 'y2': 0.7})
[Assistant] Based on my examination of the pie chart, I can see the relative sizes of each segment: 1. **Royal Blue** - appears to be the largest segment, taking up roughly 35-40% of the pie 2. **Peru** (brown/orange) - appears to be the second largest, roughly 25-30% 3. **Red** - appears to be third, roughly 20% 4. **Light Slate** (gray) - appears to be fourth, roughly 15% 5. **Cyan** - appears to be the smallest segment, roughly 5-10% **Answer: Yes, Cyan is the minimum.** Cyan has the smallest slice in the pie chart, making it the minimum value among all five categories shown. Ground truth: Yes.
Try Another Example
Question: Is Forest Green greater than Medium Orchid?
Claude's analysis:
[Assistant] I'll help you answer whether Forest Green is greater than Medium Orchid. Let me first examine the image to identify these colors and their values.
[Tool] crop_image({'x1': 0, 'y1': 0, 'x2': 0.3, 'y2': 1})
[Tool] crop_image({'x1': 0, 'y1': 0.85, 'x2': 1, 'y2': 1})
[Assistant] Based on the chart, I can see: - **Forest Green** (the first bar on the left, dark green color) has a value of approximately 35 - **Medium Orchid** (the second bar, purple/orchid color) has a value of approximately 34 **Answer: Yes, Forest Green is greater than Medium Orchid.** Forest Green has a slightly higher value (around 35) compared to Medium Orchid (around 34), though the difference is quite small. Ground truth: Yes.
Summary
The crop tool pattern is simple but powerful:
- Define a tool that takes normalized bounding box coordinates
- Return the cropped image as base64 in the tool result
- Let Claude decide when and where to crop
This works because Claude can see the full image first, identify regions that need closer inspection, and iteratively zoom in.
Alternative: Using the Claude Agent SDK
The Claude Agent SDK provides a cleaner way to define tools using Python decorators and handles the agentic loop automatically.
Note: you may need to restart the kernel to use updated packages.
Question: Is Cyan the minimum?
[Assistant] I'll first read the image to understand its content, then examine specific regions if needed.
[Tool] Read({'file_path': 'chart.png'})
[Assistant] Looking at this pie chart, I can clearly see the different segments and their relative sizes. Let me crop the area showing the Cyan segment to examine it more closely.
[Tool] mcp__crop__crop_image({'image_path': 'chart.png', 'x1': 0.4, 'y1': 0.6, 'x2': 0.7, 'y2': 0.9})
[Assistant] Now I can clearly analyze the chart. Looking at the pie chart: **Yes, Cyan is the minimum.** The pie chart shows 5 categories with the following relative sizes (from largest to smallest): 1. **Royal Blue** - the largest segment (takes up roughly half the pie) 2. **Peru** (tan/brown) - second largest 3. **Red** - medium-sized segment 4. **Light Slate** (gray) - smaller segment 5. **Cyan** - the smallest segment The Cyan segment is clearly the thinnest slice of the pie, making it the minimum value among all the categories shown.