Official Verified media Safety 4/5

voice-note-to-midi

Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing

Why use this skill?

Transform your humming, voice memos, and melodic audio into quantized MIDI files for your DAW using ML-powered pitch detection and intelligent post-processing.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/danbennettuk/voice-note-to-midi
What This Skill Does

The voice-note-to-midi skill is an audio processing pipeline that converts raw vocal recordings, humming, or other melodic audio into structured MIDI data using Spotify's Basic Pitch machine learning model. The process begins with harmonic-percussive source separation (HPSS), which isolates the melodic content from background noise and transient percussive elements. The ML model then performs pitch detection on the isolated melody. Next, the skill applies post-processing: Krumhansl-Kessler-based key detection, followed by automatic quantization, which snaps notes to a musical grid. Further refinements (octave pruning, legato note merging, and velocity normalization) are intended to make the resulting MIDI file musical as well as accurate, so it can be dragged straight into your Digital Audio Workstation (DAW).
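As an illustration of the key detection step, here is a minimal sketch of Krumhansl-Kessler key estimation in plain Python. It is not the skill's actual code: the function names and the (MIDI pitch, duration) note format are assumptions made for this example. It builds a duration-weighted pitch-class histogram and correlates it against the published Krumhansl-Kessler major and minor profiles for all 12 possible tonics.

```python
import math

# Krumhansl-Kessler probe-tone profiles, indexed by semitones above the tonic.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
KK_MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def estimate_key(notes):
    """Estimate the key of a melody.

    notes: iterable of (midi_pitch, duration_seconds) pairs (hypothetical format).
    Returns a (tonic_name, mode) tuple such as ("C", "major").
    """
    # Duration-weighted pitch-class histogram, index 0 = C.
    histogram = [0.0] * 12
    for pitch, duration in notes:
        histogram[pitch % 12] += duration

    best = None
    for tonic in range(12):
        # Rotate so that index 0 is the candidate tonic.
        rotated = histogram[tonic:] + histogram[:tonic]
        for mode, profile in (("major", KK_MAJOR), ("minor", KK_MINOR)):
            score = pearson(rotated, profile)
            if best is None or score > best[0]:
                best = (score, PITCH_NAMES[tonic], mode)
    return best[1], best[2]
```

Weighting by duration rather than note count keeps short passing tones from dominating the estimate; a real pipeline might also weight by velocity or use a different profile set.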

Installation

The skill requires Python 3.11+ and FFmpeg. The easiest way to get started is the OpenClaw skill manager: clawhub install openclaw/skills/skills/danbennettuk/voice-note-to-midi. Alternatively, install manually: clone the source repository into ~/melody-pipeline, create a virtual environment, and install the required Python dependencies, which include basic-pitch, librosa, and music21. Once installed, add the directory to your PATH so that you can invoke the pipeline directly from your terminal or via the OpenClaw agent.
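Before installing, it can help to confirm the two stated prerequisites. The following is a small preflight sketch using only the Python standard library; the function name and messages are hypothetical and not part of the skill. It assumes the FFmpeg executable is named ffmpeg and is expected on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the installation notes


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed.

    version_info and which are injectable so the check can be exercised
    without depending on the machine it runs on.
    """
    if version_info is None:
        version_info = sys.version_info

    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            "Python %d.%d+ is required, found %d.%d"
            % (MIN_PYTHON[0], MIN_PYTHON[1], version_info[0], version_info[1])
        )
    if which("ffmpeg") is None:
        problems.append("FFmpeg executable 'ffmpeg' was not found on PATH")
    return problems
```

Calling check_prerequisites() with no arguments checks the running interpreter and the current PATH; printing the returned list gives a quick summary of anything that needs fixing first.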

Use Cases

  • Songwriting: Capture fleeting melodic ideas while on the go and turn them into MIDI to keep as project files.
  • Transcribing: Quickly convert a recorded vocal melody into notation or MIDI for analysis.
  • Workflow Acceleration: Eliminate the manual labor of MIDI programming by 'singing' your synth lines, basslines, or vocal leads directly into your DAW.
  • Musical Ideation: Experiment with voice-led compositions that can be re-synthesized using virtual instruments.

Example Prompts

  • "Convert my voice memo 'idea_01.mp3' into a MIDI file and quantize it to 16th notes."
  • "Process the recording of me humming this bassline and output a MIDI file named 'new_bassline.mid'."
  • "Take the latest audio file from my recordings folder, extract the melody, and snap it to C Major."
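To make the terms in these prompts concrete, here is a rough sketch of what "quantize to 16th notes" and "snap to C Major" could mean for a list of detected notes. It is illustrative only: the (start, end, pitch) note format, the function names, and the tie-breaking rule (prefer the lower neighbor) are assumptions, and the skill's own post-processing may differ.

```python
C_MAJOR_PITCH_CLASSES = frozenset({0, 2, 4, 5, 7, 9, 11})  # C D E F G A B


def quantize_time(seconds, bpm, steps_per_beat=4):
    """Snap a time in seconds to the nearest grid step (4 steps per beat = 16th notes)."""
    step = 60.0 / bpm / steps_per_beat
    # Note: Python's round() sends exact .5 ties to the nearest even integer.
    return round(seconds / step) * step


def snap_to_scale(pitch, pitch_classes=C_MAJOR_PITCH_CLASSES):
    """Move a MIDI pitch to the nearest pitch in the scale, preferring the lower neighbor."""
    for offset in (0, -1, 1, -2, 2):
        if (pitch + offset) % 12 in pitch_classes:
            return pitch + offset
    return pitch  # scale too sparse to find a neighbor within two semitones


def quantize_notes(notes, bpm, steps_per_beat=4, pitch_classes=C_MAJOR_PITCH_CLASSES):
    """Quantize (start_seconds, end_seconds, midi_pitch) tuples to the grid and scale."""
    step = 60.0 / bpm / steps_per_beat
    result = []
    for start, end, pitch in notes:
        q_start = quantize_time(start, bpm, steps_per_beat)
        # Keep every note at least one grid step long after snapping.
        q_end = max(quantize_time(end, bpm, steps_per_beat), q_start + step)
        result.append((q_start, q_end, snap_to_scale(pitch, pitch_classes)))
    return result
```

At 120 BPM a 16th note lasts 0.125 seconds, so a C#4 sung from 0.13 s to 0.49 s would become a C4 from 0.125 s to 0.5 s under these assumptions.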

Tips & Limitations

For best results, record in a quiet environment with minimal background noise. Although the HPSS step is effective, very noisy recordings may still introduce artifacts. Sing or hum with a steady rhythm so that notes land close to the quantization grid. The skill is optimized for monophonic melodies; polyphonic or chordal recordings are likely to be transcribed less accurately during the pitch detection phase.

Metadata

Stars: 3376
Views: 4
Updated: 2026-03-24

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-danbennettuk-voice-note-to-midi": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags

#audio #midi #music #transcription #machine-learning
Safety Score: 4/5

Flags: file-write, file-read, code-execution