voice-note-to-midi

Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/danbennettuk/voice-note-to-midi

🎵 Voice Note to MIDI

Transform your voice memos, humming, and melodic recordings into clean, quantized MIDI files ready for your DAW.

What It Does

This skill provides a complete audio-to-MIDI conversion pipeline that:

  1. Stem Separation - Uses HPSS (Harmonic-Percussive Source Separation) to isolate melodic content from drums, noise, and background sounds
  2. ML-Powered Pitch Detection - Leverages Spotify's Basic Pitch model for accurate fundamental frequency extraction
  3. Key Detection - Automatically detects the musical key of your recording using Krumhansl-Kessler key profiles
  4. Intelligent Quantization - Snaps notes to a configurable timing grid with optional key-aware pitch correction
  5. Post-Processing - Applies octave pruning, overlap-based harmonic removal, and legato note merging for clean output
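
As an illustration of the key detection step, here is a minimal sketch of profile matching with the Krumhansl-Kessler key profiles. It assumes notes arrive as (MIDI pitch, duration) pairs and weights each pitch class by duration; the function names and the weighting are assumptions made for this example, not the skill's actual code.

```python
from math import sqrt

# Krumhansl-Kessler probe-tone profiles, indexed by semitones above the tonic.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
KK_MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def detect_key(notes):
    """Return (tonic_name, mode, correlation) for (midi_pitch, duration) pairs.

    Hypothetical helper: builds a duration-weighted pitch-class histogram and
    picks the major or minor key whose rotated profile correlates best with it.
    """
    histogram = [0.0] * 12
    for pitch, duration in notes:
        histogram[pitch % 12] += duration

    best = None
    for tonic in range(12):
        for mode, profile in (("major", KK_MAJOR), ("minor", KK_MINOR)):
            # Rotate the profile so that its tonic entry lines up with `tonic`.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            score = pearson(histogram, rotated)
            if best is None or score > best[0]:
                best = (score, PITCH_NAMES[tonic], mode)
    return best[1], best[2], best[0]


# A C major scale with extra weight on the tonic.
melody = [(60, 2.0), (62, 1.0), (64, 1.5), (65, 1.0),
          (67, 1.5), (69, 1.0), (71, 0.5), (72, 2.0)]
tonic, mode, score = detect_key(melody)
print(tonic, mode, round(score, 3))
```

Short or chromatic recordings can correlate almost equally well with several keys, so a real pipeline would probably want to expose the score rather than trust the top match blindly.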

Pipeline Architecture

Audio Input (WAV/M4A/MP3)
    ↓
┌─────────────────────────────────────┐
│ Step 1: Stem Separation (HPSS)      │
│ - Isolate harmonic content          │
│ - Remove drums/percussion           │
│ - Noise gating                      │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ Step 2: Pitch Detection             │
│ - Basic Pitch ML model (Spotify)    │
│ - Polyphonic note detection         │
│ - Onset/offset estimation           │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ Step 3: Analysis                    │
│ - Pitch class distribution          │
│ - Key detection                     │
│ - Dominant note identification      │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ Step 4: Quantization & Cleanup      │
│ - Timing grid snap                  │
│ - Key-aware pitch correction        │
│ - Octave pruning (harmonic removal) │
│ - Overlap-based pruning             │
│ - Note merging (legato)             │
│ - Velocity normalization            │
└─────────────────────────────────────┘
    ↓
MIDI Output (Standard MIDI File)
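
To make Step 4 more concrete, the sketch below shows one plausible way to do the timing grid snap, key-aware pitch correction, and legato merging. The `Note` type, the function names, the 120 BPM default, and the sixteenth-note grid are all assumptions for illustration; the skill's own implementation and defaults may differ.

```python
from dataclasses import dataclass, replace

MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)
MINOR_STEPS = (0, 2, 3, 5, 7, 8, 10)  # natural minor


@dataclass(frozen=True)
class Note:
    pitch: int    # MIDI note number
    start: float  # seconds
    end: float    # seconds


def snap_time(t, bpm=120.0, divisions_per_beat=4):
    """Snap a time in seconds to the nearest grid line (sixteenths by default)."""
    grid = 60.0 / bpm / divisions_per_beat
    return round(t / grid) * grid


def snap_pitch(pitch, tonic_pc, steps=MAJOR_STEPS):
    """Move a pitch to the nearest scale tone, preferring the lower neighbour."""
    scale = {(tonic_pc + s) % 12 for s in steps}
    for offset in (0, -1, 1):
        if (pitch + offset) % 12 in scale:
            return pitch + offset
    return pitch  # not reached for seven-note major/minor scales


def quantize(notes, tonic_pc=None, steps=MAJOR_STEPS, bpm=120.0, divisions_per_beat=4):
    """Snap note boundaries to the grid; correct pitches only if a key is given."""
    grid = 60.0 / bpm / divisions_per_beat
    out = []
    for n in notes:
        start = snap_time(n.start, bpm, divisions_per_beat)
        # Keep every note at least one grid step long.
        end = max(snap_time(n.end, bpm, divisions_per_beat), start + grid)
        pitch = n.pitch if tonic_pc is None else snap_pitch(n.pitch, tonic_pc, steps)
        out.append(Note(pitch, start, end))
    return sorted(out, key=lambda n: (n.start, n.pitch))


def merge_legato(notes, max_gap=0.0):
    """Merge same-pitch notes whose gap is at most max_gap seconds."""
    merged = []
    for n in sorted(notes, key=lambda n: (n.pitch, n.start)):
        if merged and merged[-1].pitch == n.pitch and n.start - merged[-1].end <= max_gap:
            merged[-1] = replace(merged[-1], end=max(merged[-1].end, n.end))
        else:
            merged.append(n)
    return sorted(merged, key=lambda n: (n.start, n.pitch))


raw = [Note(61, 0.02, 0.24), Note(61, 0.26, 0.49), Note(64, 0.51, 0.98)]
clean = merge_legato(quantize(raw, tonic_pc=0))  # tonic_pc=0 means C major here
print(clean)
```

Running the pitch correction before the merge matters in this sketch: two slightly detuned fragments of the same sung note only become mergeable once they share a pitch.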

Setup

Prerequisites

  • Python 3.11+ (Python 3.14+ recommended)
  • FFmpeg (for audio format support)
  • pip

Installation

Quick Install (Recommended):

cd /path/to/voice-note-to-midi
./setup.sh

This automated script will:

  • Check that Python 3.11+ is installed
  • Create the ~/melody-pipeline directory
  • Set up the virtual environment
  • Install all dependencies (basic-pitch, librosa, music21, etc.)
  • Download and configure the hum2midi script
  • Add melody-pipeline to your PATH

Manual Install:

If you prefer manual setup:

mkdir -p ~/melody-pipeline
cd ~/melody-pipeline
python3 -m venv venv-bp
source venv-bp/bin/activate
pip install basic-pitch librosa soundfile mido music21
chmod +x ~/melody-pipeline/hum2midi

Add to your PATH (optional):

echo 'export PATH="$HOME/melody-pipeline:$PATH"' >> ~/.bashrc
source ~/.bashrc

Verify Installation
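
One way to check the environment, offered as a sketch rather than the skill's own verification step: confirm that the packages from the manual install are importable and that FFmpeg is on the PATH. The import names below are the usual module names for those pip packages; the helper names are hypothetical. Run it with the venv-bp environment activated.

```python
import shutil
from importlib.util import find_spec

# Import names for the packages listed in the manual install step.
REQUIRED_MODULES = ["basic_pitch", "librosa", "soundfile", "mido", "music21"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


def has_ffmpeg():
    """Return True if an ffmpeg executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All pipeline dependencies are importable.")
    print("FFmpeg found." if has_ffmpeg() else "FFmpeg not found on PATH.")
```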

Metadata

Stars: 2387
Views: 0
Updated: 2026-03-09

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-danbennettuk-voice-note-to-midi": {
      "enabled": true,
      "auto_update": true
    }
  }
}
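
If clawhub.json already lists other plugins, pasting the snippet over it would drop them. The sketch below merges the entry into an existing file instead. It assumes the file is plain JSON with a top-level "plugins" object, as in the snippet above; the helper names are hypothetical and nothing here is part of the clawhub CLI.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-danbennettuk-voice-note-to-midi"


def enable_plugin(config, plugin_id=PLUGIN_ID):
    """Return a copy of config with the plugin entry added; other keys are kept."""
    plugins = dict(config.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    return {**config, "plugins": plugins}


def update_config_file(path="clawhub.json"):
    """Read the JSON file at path (if it exists), add the entry, and write it back."""
    p = Path(path)
    config = json.loads(p.read_text()) if p.exists() else {}
    p.write_text(json.dumps(enable_plugin(config), indent=2) + "\n")


existing = {"plugins": {"some-other-plugin": {"enabled": False}}}
print(json.dumps(enable_plugin(existing), indent=2))
```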

Tags

#audio #midi #music #transcription #machine-learning

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.