What This Skill Does

The TranslateImage skill lets your OpenClaw agent read, translate, and edit text inside images. Through the TranslateImage REST API, the agent can run OCR (Optical Character Recognition), translate foreign-language text while preserving the original design, and inpaint over unwanted text overlays to remove them. It sits between raw pixels and usable text, which makes it useful for cross-lingual accessibility and document analysis.

Installation

Install the skill into your local OpenClaw environment by running the following terminal command:

    clawhub install openclaw/skills/skills/cottom/translate-image

Once installed, obtain an API key from translateimage.io and set it as an environment variable:

    export TRANSLATEIMAGE_API_KEY=your-api-key

The agent uses this key to authenticate with the TranslateImage backend.
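
If the agent fails to authenticate, a quick check that the variable is visible to the process can help. The following is a minimal Python sketch, not part of the skill itself: the helper name load_api_key is hypothetical, and the only things taken from the instructions above are the variable name and the your-api-key placeholder.

```python
import os

ENV_VAR = "TRANSLATEIMAGE_API_KEY"   # name taken from the export step above
PLACEHOLDER = "your-api-key"         # literal placeholder shown in the install step


def load_api_key(env=None):
    """Return the key from the environment; raise if it is missing or still the placeholder.

    `env` is any mapping (defaults to os.environ) so the check can be tested in isolation.
    This helper is a hypothetical illustration, not part of the TranslateImage skill.
    """
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            f"{ENV_VAR} is not set to a real key; export it before starting the agent"
        )
    return key
```

Calling load_api_key() with no arguments reads the real environment, so a missing export surfaces as a clear error before any image is sent.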

Use Cases

Manga/Comic Localization: Read foreign-language comics by translating dialogue bubbles in-place without ruining the art.
Signage Translation: Upload photos of street signs or menus while traveling to instantly see English text layered over the original image.
Data Extraction: Convert screenshots of tables or documents into structured text data using the built-in OCR capabilities.
Content Cleaning: Remove distracting watermark text or captions from images using the inpainting tool.

Example Prompts

"Translate the text in this image to Japanese, keeping the original font style."
"Can you extract the text from this screenshot and save it to my clipboard?"
"Remove the Chinese watermark from this image and show me the clean version."

Tips & Limitations

Image Constraints: The API supports JPEG, PNG, WebP, and GIF formats, with a maximum file size of 10MB.
Security: The agent is configured to only process URLs provided by the user. Always verify the source before requesting the agent to fetch an external URL.
Model Selection: For general use, the default gemini-2.5-flash model is recommended. However, for specialized academic or creative translations, consider switching to gpt-5.1 or kimi-k2 via the configuration field.
Processing Time: High-resolution images may take a few seconds to process due to the inpainting and translation overhead. Ensure your internet connection is stable for large image uploads.
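
The format and size constraints above can be checked locally before an image is handed to the agent. The sketch below is a hedged illustration: check_image is a hypothetical helper, the test is by file extension only (it does not inspect file contents), and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption, since the exact byte limit is not stated.

```python
from pathlib import Path

# Assumption: "10MB" is interpreted as 10 MiB; the API's exact byte limit is not documented here.
MAX_BYTES = 10 * 1024 * 1024
# Extensions corresponding to the listed formats: JPEG, PNG, WebP, GIF.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}


def check_image(filename, size_bytes):
    """Return a list of problems; an empty list means the file passes both checks.

    Hypothetical pre-flight helper: looks only at the extension and the size in bytes.
    """
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```

For a file on disk, the two arguments can be taken from a Path object, for example check_image(p.name, p.stat().st_size).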
