gemini-assistant
General-purpose AI assistant built on the Gemini API, with voice and text support. Use it when you need an assistant that can answer questions, hold conversations, or help with general tasks using Google's Gemini models over audio or text.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/alimostafaradwan/gemini-assistant
What This Skill Does
The Gemini Assistant skill lets your OpenClaw agent talk to Google's Gemini models directly, in either text chat or voice conversation. Using the Gemini 2.5-flash-native-audio-preview model, the assistant can process audio files, transcribe spoken input, and generate spoken responses, which suits hands-free work as well as more involved queries. Its system instructions are configurable, so you can tailor the assistant to specialized tasks within the OpenClaw ecosystem.
Installation
Before installing, make sure the required dependencies are available: the Google GenAI SDK, NumPy, and FFmpeg (used for audio transcoding). Then use the OpenClaw command-line interface to add the skill directly from the repository:
clawhub install openclaw/skills/skills/alimostafaradwan/gemini-assistant
After installation, navigate to the skill directory at ~/.openclaw/agents/kashif/skills/gemini-assistant. You must provide your Google Gemini API key to authorize requests, either by exporting the GEMINI_API_KEY environment variable in your shell session or by creating a .env file in the skill's root folder. Once the key is configured, invoke the agent through the provided handler.py script for text or voice operations.
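The two ways of supplying the key can be sketched as follows. This is a minimal illustration, not the skill's actual handler.py logic: the function name load_api_key is hypothetical, and both the precedence (environment variable first, then .env) and the simple KEY=value file format are assumptions.

```python
import os
from pathlib import Path


def load_api_key(skill_dir, environ=None):
    """Return GEMINI_API_KEY from the environment, else from skill_dir/.env.

    Hypothetical helper: the lookup order is an assumption, not something
    the skill documents.
    """
    environ = os.environ if environ is None else environ
    key = environ.get("GEMINI_API_KEY")
    if key:
        return key

    env_file = Path(skill_dir) / ".env"
    if env_file.is_file():
        for line in env_file.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blank lines, comments, and anything that is not KEY=value.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "GEMINI_API_KEY":
                # Drop optional surrounding quotes.
                return value.strip().strip("'\"")

    raise RuntimeError("GEMINI_API_KEY is not set in the environment or in .env")
```

With the directory named above, a call would look like load_api_key(Path.home() / ".openclaw/agents/kashif/skills/gemini-assistant"). Failing early with a clear error avoids a less obvious authorization failure on the first API request.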
Use Cases
This skill suits users who want one assistant for a range of everyday tasks. Use it to summarize lengthy project documentation, draft emails, or brainstorm ideas while away from your keyboard. Voice input is particularly useful in fast-paced environments or for anyone who prefers spoken interaction. Because it supports system-level instructions, you can also use it for structured data extraction or as a pair programmer for your development tasks.
Example Prompts
- "Summarize the following project requirements and list the three most critical tasks that need completion by the end of this week."
- [Voice Input] "I'm currently brainstorming a new feature for the OpenClaw plugin, can you help me outline the architecture using the Google Gemini model?"
- "Explain the concept of container orchestration as if I were a junior developer looking to understand Docker Swarm vs Kubernetes."
Tips & Limitations
To get the best performance, ensure your API key has appropriate quota limits enabled in the Google AI Studio console. For heavy audio tasks, consider that processing time depends on the length of the file and network latency to Google's servers. If you encounter issues with audio conversion, double-check that FFmpeg is installed and accessible in your system's PATH. Note that the default model is optimized for native audio, but if you require strictly text-based interaction, switching the MODEL variable in handler.py to a more cost-effective text-only variant can improve speed and reduce resource overhead.
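The FFmpeg check mentioned above can be scripted together with a check for the Python dependencies. The sketch below uses only the standard library; the function name missing_dependencies is hypothetical, and the import names google.genai and numpy are assumptions about how the Google GenAI SDK and NumPy are installed.

```python
import importlib.util
import shutil


def missing_dependencies(modules, binaries):
    """Return the names from `modules` and `binaries` that cannot be found."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (for example "google") is absent.
            found = False
        if not found:
            missing.append(name)
    for name in binaries:
        # shutil.which searches the directories listed in PATH.
        if shutil.which(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names; adjust them to match your installation.
    problems = missing_dependencies(["google.genai", "numpy"], ["ffmpeg"])
    print("Missing:", ", ".join(problems) if problems else "nothing")
```

Running this before a voice session shows whether an audio conversion failure is likely to come from a missing FFmpeg binary rather than from the API.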
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-alimostafaradwan-gemini-assistant": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: external-api, file-read, file-write