Official Verified ai models Safety 4/5

alicloud-ai-audio-cosyvoice-voice-clone

Use when creating cloned voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from reference audio and then reusing the returned voice_id in later TTS calls.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/cinience/alicloud-ai-audio-cosyvoice-voice-clone

What This Skill Does

The alicloud-ai-audio-cosyvoice-voice-clone skill provides an interface to the Alibaba Cloud Model Studio CosyVoice customization service. It performs voice enrollment: you submit a reference audio sample, the service processes the voice signature, and it returns a unique voice_id. Subsequent text-to-speech (TTS) tasks within the OpenClaw ecosystem pass that voice_id so that generated speech consistently uses the cloned voice profile. The skill supports the cosyvoice-v3.5-plus and cosyvoice-v3.5-flash models.

Installation

To install this skill, use the OpenClaw command-line interface:

clawhub install openclaw/skills/skills/cinience/alicloud-ai-audio-cosyvoice-voice-clone

Before using the skill, configure authentication either by setting the DASHSCOPE_API_KEY environment variable or by adding your credentials to the ~/.alibabacloud/credentials file. The skill cannot communicate with the Alibaba Cloud backend unless one of the two is in place.
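
A quick pre-flight check can confirm that one of the two credential sources is present before the skill is invoked. The following is a minimal sketch in Python, not part of the skill itself: the helper name find_credential_source is hypothetical, and it only checks that the credentials file exists, without assuming anything about the file's internal format.

```python
import os
from pathlib import Path


def find_credential_source(env=None, home=None):
    """Return "env", "file", or None depending on which credential source exists.

    Hypothetical helper for illustration; not part of the skill.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # First source checked: the DASHSCOPE_API_KEY environment variable.
    if env.get("DASHSCOPE_API_KEY"):
        return "env"

    # Second source: the credentials file (only its existence is checked).
    if (home / ".alibabacloud" / "credentials").is_file():
        return "file"

    return None


if __name__ == "__main__":
    source = find_credential_source()
    print(source or "no credentials found")
```

Checking the environment variable before the file is an arbitrary ordering chosen for this sketch; the source does not say which one takes precedence when both are set.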

Use Cases

This skill is designed for scenarios requiring personalized audio experiences, such as:

  • Creating custom AI avatars that speak with a specific, recognizable human voice.
  • Generating consistent brand voices for corporate videos, product demonstrations, or interactive customer service agents.
  • Developing localized audio content where a specific speaker's tone and prosody must be preserved across different languages.
  • Prototyping voice-interactive applications by cloning specific personas for character testing.

Example Prompts

  1. "Clone the voice from this sample audio at https://example.com/speaker1.wav for my project, use the cosyvoice-v3.5-plus model, and assign it the prefix 'myBrandVoice'."
  2. "I need a new voice clone for Chinese language content. Please use the sample at https://example.com/audio.wav with the cosyvoice-v3.5-flash model and set the language hint to 'zh'."
  3. "Enroll a new custom voice with the prefix 'agent01' using the audio file located at https://example.com/reference.mp3 for the latest cosyvoice-v3.5-plus model."
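
The prompts above all carry the same few parameters: a reference audio URL, a target model, and optionally a prefix and a language hint. As a rough illustration of how those might be collected and validated before enrollment, here is a sketch in Python. The EnrollmentRequest class and its field names are hypothetical and do not reflect the skill's actual request schema; treating the two models named on this page as the only allowed values is also an assumption.

```python
from dataclasses import dataclass, asdict
from typing import Optional
from urllib.parse import urlparse

# Models named on this page; treating them as the full allowed set is an assumption.
SUPPORTED_MODELS = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash"}


@dataclass(frozen=True)
class EnrollmentRequest:
    """Hypothetical container for the parameters mentioned in the prompts."""

    url: str
    target_model: str
    prefix: Optional[str] = None
    language_hint: Optional[str] = None

    def __post_init__(self):
        # The reference audio is given as a web URL in every example prompt.
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"reference audio must be an http(s) URL: {self.url!r}")
        if self.target_model not in SUPPORTED_MODELS:
            raise ValueError(f"unsupported target_model: {self.target_model!r}")

    def to_payload(self):
        # Drop unset optional fields so the payload carries only what was requested.
        return {key: value for key, value in asdict(self).items() if value is not None}


if __name__ == "__main__":
    request = EnrollmentRequest(
        url="https://example.com/speaker1.wav",
        target_model="cosyvoice-v3.5-plus",
        prefix="myBrandVoice",
    )
    print(request.to_payload())
```

Validating at construction time means a malformed URL or unknown model is rejected before any enrollment call is attempted.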

Tips & Limitations

  • Regional Requirements: Some models, such as cosyvoice-v3.5-plus, are currently limited to the China mainland deployment. Make sure your region settings match the model you select.
  • Voice Consistency: Always use the same target_model for enrollment and for the subsequent TTS synthesis call; a mismatch causes the synthesis request to fail.
  • Quota Management: Each enrollment consumes credits. Once a voice has been cloned successfully, reuse its voice_id instead of enrolling the same audio again.
  • Audio Quality: Use clear, high-quality reference audio without significant background noise to maximize the fidelity of the cloned voice.
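
Two of the tips above, reusing an existing voice_id and keeping target_model consistent, can be enforced with a small local record of past enrollments. The sketch below is one possible approach in Python, under stated assumptions: VoiceRegistry and its JSON file layout are hypothetical, and enroll is a caller-supplied function standing in for whatever actually performs the enrollment.

```python
import json
from pathlib import Path


class VoiceRegistry:
    """Hypothetical local cache mapping a prefix to its voice_id and target_model."""

    def __init__(self, path):
        self.path = Path(path)
        # Load earlier enrollments if the cache file already exists.
        self.entries = json.loads(self.path.read_text()) if self.path.exists() else {}

    def get_or_enroll(self, prefix, target_model, enroll):
        """Return a cached voice_id, calling enroll() only when none is recorded."""
        entry = self.entries.get(prefix)
        if entry is not None:
            self.check_model(prefix, target_model)
            return entry["voice_id"]
        # Only this branch triggers a new enrollment (and so spends credits).
        voice_id = enroll(prefix, target_model)
        self.entries[prefix] = {"voice_id": voice_id, "target_model": target_model}
        self.path.write_text(json.dumps(self.entries, indent=2))
        return voice_id

    def check_model(self, prefix, tts_model):
        """Raise if the TTS model differs from the model used at enrollment."""
        enrolled = self.entries[prefix]["target_model"]
        if enrolled != tts_model:
            raise ValueError(
                f"voice {prefix!r} was enrolled with {enrolled!r}, not {tts_model!r}"
            )
```

Calling check_model before each synthesis request surfaces a model mismatch locally instead of as a failed API call.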

Metadata

Author: @cinience
Stars: 3562
Views: 1
Updated: 2026-03-29

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-cinience-alicloud-ai-audio-cosyvoice-voice-clone": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#audio #voice-cloning #tts #alicloud #cosyvoice
Safety Score: 4/5

Flags: network-access, external-api