songsee
Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.
Install via CLI (Recommended)
clawhub install openclaw/openclaw/skills/songsee
What This Skill Does
songsee is a command-line tool that renders audio files such as MP3 or WAV as images. It generates spectrograms and multi-panel feature visualizations, including chroma, mel-frequency cepstral coefficients (MFCCs), harmonic-percussive source separation (HPSS), and tempograms. It is aimed at audio engineers, music producers, and data scientists who need to inspect rhythmic, tonal, or spectral content without opening a Digital Audio Workstation (DAW).
Installation
To integrate this tool into your OpenClaw environment, use the provided installer:
clawhub install openclaw/openclaw/skills/songsee
songsee decodes WAV and MP3 natively. To process any other audio format, make sure ffmpeg is installed on your system, since songsee relies on it for wider codec support.
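The ffmpeg requirement can be checked before a run. The following is a minimal sketch, assuming that WAV and MP3 are the only natively decoded formats, as stated above. The helper names `needs_ffmpeg` and `can_process` are hypothetical and are not part of songsee.

```python
import shutil
from pathlib import Path

# Formats the text above describes as natively supported.
NATIVE_EXTENSIONS = {".wav", ".mp3"}


def needs_ffmpeg(path: str) -> bool:
    """Return True if the file extension is outside the native set."""
    return Path(path).suffix.lower() not in NATIVE_EXTENSIONS


def can_process(path: str) -> bool:
    """Return True if the file is native or ffmpeg is found on PATH."""
    return not needs_ffmpeg(path) or shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    for name in ("take1.wav", "take2.flac"):
        print(name, "needs ffmpeg:", needs_ffmpeg(name))
```

The check looks only at the file extension, so a mislabeled file would still be passed through; it is meant as a quick guard, not as format detection.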
Use Cases
- Music Production: Quickly identify frequency masking issues or check the harmonic balance of a mix by analyzing the spectrogram and chroma panels.
- Music Information Retrieval (MIR): Extract temporal features like tempograms and flux to analyze beat patterns and rhythmic structure in large datasets.
- Audio Archiving: Generate visual thumbnails for large audio libraries to make browsing files more intuitive.
- Academic Research: Visualize spectral data for psychoacoustic studies or sound design experiments.
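For the archiving and dataset use cases above, one possible approach is to plan one songsee invocation per file and run the commands afterwards. This is a sketch under stated assumptions: the function `plan_thumbnails` is hypothetical, and the `-o` output flag is assumed rather than confirmed by this page, so check it against `songsee --help` before relying on it.

```python
from pathlib import Path

# Extensions treated as audio here; extend the set if ffmpeg is installed.
AUDIO_EXTENSIONS = {".wav", ".mp3"}


def plan_thumbnails(library: Path, out_dir: Path) -> list[list[str]]:
    """Build one songsee command per audio file without running anything."""
    commands = []
    for audio in sorted(library.rglob("*")):
        if not audio.is_file() or audio.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        target = out_dir / (audio.stem + ".png")
        # Assumption: -o names the output image file.
        commands.append(["songsee", str(audio), "-o", str(target)])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Files that share a stem in different subfolders would map to the same output name, so a real script would need to disambiguate them.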
Example Prompts
- "Generate a detailed 10-second spectral visualization of the file 'drum_loop.wav' focusing on the low-end frequencies between 20Hz and 500Hz."
- "Create a multi-panel visualization for 'piano_concerto.mp3' containing the mel spectrogram, chroma, and tempogram, and save the result as a high-quality PNG."
- "Analyze the last 30 seconds of 'podcast_clip.mp3' and render the result using the magma color palette to highlight amplitude variations."
Tips & Limitations
- Performance: When requesting a large number of visualizations via the --viz flag, the rendering time will increase. For long files, consider using the --start and --duration flags to minimize compute load.
- Color Mapping: Use the --style flag (e.g., magma, viridis) to ensure the visualization is readable based on your display or accessibility requirements.
- Format Handling: Always prioritize WAV files for maximum accuracy during analysis, as compressed formats like MP3 may introduce artifacts in high-frequency spectral data due to lossy encoding.
- Flexibility: The skill supports piping audio through standard input, which is ideal for automated pipelines or shell-based workflows, but ensure your system has sufficient memory allocated for large file buffers.
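The flags named in the tips above (--viz, --style, --start, --duration) can be assembled programmatically. The sketch below is illustrative only: `build_songsee_args` and `tail_window` are hypothetical helpers, and it assumes that --viz accepts a comma-separated list and that panel names such as "mel" and "chroma" are valid values, neither of which this page confirms.

```python
from typing import Optional, Sequence


def build_songsee_args(
    audio_path: str,
    viz: Sequence[str] = (),
    style: Optional[str] = None,
    start: Optional[float] = None,
    duration: Optional[float] = None,
) -> list[str]:
    """Assemble an argument list for songsee; nothing is executed."""
    args = ["songsee", audio_path]
    if viz:
        # Assumption: panels are passed as one comma-separated value.
        args += ["--viz", ",".join(viz)]
    if style is not None:
        args += ["--style", style]
    if start is not None:
        args += ["--start", str(start)]
    if duration is not None:
        args += ["--duration", str(duration)]
    return args


def tail_window(total_seconds: float, tail_seconds: float) -> tuple[float, float]:
    """Return (start, duration) covering the last tail_seconds of a file."""
    start = max(0.0, total_seconds - tail_seconds)
    return start, total_seconds - start
```

For a request like "the last 30 seconds" of a 120-second file, `tail_window(120.0, 30.0)` gives a start of 90.0 and a duration of 30.0, which can then be passed as `start` and `duration`. Limiting the time slice this way keeps rendering cost down on long files.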
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-openclaw-songsee": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-read, file-write
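If clawhub.json already lists other plugins, pasting the fragment over it would drop them. A minimal sketch of merging the entry instead, assuming the file has the "plugins" layout shown above; `enable_songsee` and the "some-other-plugin" entry are hypothetical names used only for illustration.

```python
import json


def enable_songsee(config: dict) -> dict:
    """Return a copy of config with the songsee plugin entry added."""
    merged = dict(config)
    # Copy the plugins mapping so the caller's dict is left untouched.
    plugins = dict(merged.get("plugins", {}))
    plugins["official-openclaw-songsee"] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


if __name__ == "__main__":
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    print(json.dumps(enable_songsee(existing), indent=2))
```

The function returns a new dictionary, so the result can be reviewed before it is written back to disk with `json.dump`.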
Related Skills
apple-notes
Create, view, edit, delete, search, move, or export Apple Notes via the memo CLI on macOS.
sherpa-onnx-tts
Local text-to-speech via sherpa-onnx (offline, no cloud)
goplaces
Query Google Places for text search, place details, resolve, reviews, or scriptable JSON via goplaces.
skill-creator
Create, edit, improve, tidy, review, audit, or restructure AgentSkills and SKILL.md files.
video-frames
Extract frames or short clips from videos using ffmpeg.