Vision — Anthropic - Dangerous Robot

Summary

Technical guide on Claude's vision capabilities, covering image upload methods, token calculation, resolution limits, and model-specific performance for Claude Opus 4.7.

Key quotes

Claude Opus 4.7 is the first Claude model with high-resolution image support.

Claude cannot be used to name people in images and refuses to do so.

Claude's spatial reasoning abilities are limited.

The documentation gives detailed specifications for multimodal input, including an approximate formula for estimating image tokens (width * height / 750, with both dimensions in pixels) and limits on the number of images per request. It also contrasts the capabilities of Claude Opus 4.7 with those of previous models.
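
The token formula quoted above can be sketched as a small helper. This is a minimal illustration, not part of any official SDK: the function name `estimate_image_tokens` is hypothetical, and rounding up to a whole token is an assumption made here so the estimate errs on the high side. It also assumes the image is sent at the dimensions given, with no resizing applied beforehand.

```python
import math


def estimate_image_tokens(width_px: int, height_px: int) -> int:
    """Estimate image tokens using the width * height / 750 approximation.

    Hypothetical helper: rounding up is an assumption, not documented behavior.
    """
    if width_px <= 0 or height_px <= 0:
        raise ValueError("image dimensions must be positive")
    # Approximation from the summarized documentation: pixel area divided by 750.
    return math.ceil(width_px * height_px / 750)


if __name__ == "__main__":
    # Print rough estimates for a few example image sizes.
    for width, height in [(200, 200), (1000, 1000), (1092, 1092)]:
        print(f"{width}x{height}: ~{estimate_image_tokens(width, height)} tokens")
```

An estimate like this is useful for budgeting requests that contain several images, since the per-image costs add up. The actual usage reported by the API may differ from the approximation.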