Fix Gateway Lock Timeout
Common Error Message
gateway already running (pid 659); lock timeout after 5000ms

The Gateway uses exclusive TCP port binding to prevent multiple instances from running simultaneously. When a previous Gateway process didn't exit cleanly, the lock mechanism can get stuck, especially in Docker and other containerized environments where PID reuse is common.
Next Step
Fix now, then reduce repeat incidents
If this issue keeps coming back, validate your setup in Doctor first, then harden your config.
Quick Fix
openclaw gateway stop && openclaw gateway start
If that still shows the lock error, run the auto-fix:
openclaw doctor --fix
How Gateway Locking Works
The Gateway has two locking mechanisms:
1. TCP Port Binding (Primary)
On startup, the Gateway binds its WebSocket listener to ws://127.0.0.1:18789 exclusively. If another instance already holds the port, startup fails immediately. The OS automatically releases the port when the process exits — even on crashes or SIGKILL.
2. Session Lock Files (Secondary)
The Gateway also maintains .lock files on disk for session-level coordination. These files contain the owning PID. When the Gateway starts, it checks if the PID in the lock file is still alive. This is where the problem occurs in containers — PID reuse can fool the check.
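The liveness test is easy to sketch in shell. This is an illustrative stand-in, not OpenClaw's actual implementation; the lock path and the single-PID file layout are assumptions:

```shell
#!/bin/sh
# Sketch of the naive PID-liveness check (assumed lock layout: the file
# holds just the owner PID; OpenClaw's real internals may differ).
LOCK=$(mktemp)                 # stand-in for ~/.openclaw/sessions/<id>.lock
echo $$ > "$LOCK"              # our own shell plays the "owner" PID

owner=$(cat "$LOCK")
if kill -0 "$owner" 2>/dev/null; then
  status="held"                # a reused PID also lands here -- the flaw
else
  status="stale"
fi
echo "lock is $status (owner pid $owner)"
rm -f "$LOCK"
```

The flaw is in the first branch: `kill -0` only asks whether *some* process has that PID, not whether it is still the process that wrote the lock.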
Docker / Container Environments
This error is most common in Docker, Fly.io, and Railway deployments. Here's why:
1. Container runs Gateway at PID 42. The session lock file records PID 42 as the owner.
2. Container restarts. The process is killed, but the lock file persists on the volume.
3. New Gateway starts, also at PID 42. Containers often reuse PIDs, so PID 42 is alive again.
4. Lock validation is fooled. The Gateway sees PID 42 is alive and assumes the old instance is still running. It waits for the lock to release, which never happens, and times out after 5-10 seconds.
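The false positive can be reproduced, along with the kind of extra validation that catches it. This is a hedged sketch: checking for an `openclaw` command name is an assumption about what a real Gateway process would report, not documented behavior:

```shell
#!/bin/sh
# The container scenario: the recorded PID is alive, but it belongs to a
# different program. Comparing the command name exposes the reuse.
LOCK=$(mktemp)
echo $$ > "$LOCK"                       # a live PID that is NOT a Gateway
owner=$(cat "$LOCK")
cmd=$(ps -o comm= -p "$owner" 2>/dev/null)
if kill -0 "$owner" 2>/dev/null && [ "$cmd" = "openclaw" ]; then
  verdict="lock held by a real Gateway"
else
  verdict="PID alive but not openclaw, lock is stale"
fi
echo "$verdict"
rm -f "$LOCK"
```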
Fix for Docker Compose
Clear lock files on container startup by adding a pre-start command:
services:
  openclaw:
    image: openclaw/openclaw:latest
    # Clear stale locks before starting
    command: sh -c "rm -f /data/.openclaw/sessions/*.lock && openclaw gateway start"
    volumes:
      - openclaw-data:/data

volumes:
  openclaw-data:

Fix for Fly.io / Railway
Use a start script as the container entrypoint:

#!/bin/sh
# Remove stale session locks left by previous container
find /data/.openclaw/sessions -name "*.lock" -delete 2>/dev/null
exec openclaw gateway start
Manual Lock Cleanup
If you're running OpenClaw directly (not in Docker), you can manually remove stale lock files:
macOS / Linux
# Check for lock files
ls -la ~/.openclaw/sessions/*.lock 2>/dev/null

# Remove stale locks
rm -f ~/.openclaw/sessions/*.lock

# Restart Gateway
openclaw gateway start
Windows
REM Check for lock files
dir "%USERPROFILE%\.openclaw\sessions\*.lock"

REM Remove stale locks
del /Q "%USERPROFILE%\.openclaw\sessions\*.lock"

REM Restart Gateway
openclaw gateway start
Only Remove Locks When Gateway Is Stopped
If the Gateway is actually running (not a stale lock), removing its lock file can cause data corruption. Always run openclaw gateway stop first, or verify the PID in the lock file is dead.
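A cautious cleanup along those lines deletes a lock only when its recorded PID is dead. The directory layout and the `clean_stale_locks` helper name are illustrative, not part of OpenClaw:

```shell
#!/bin/sh
# Delete only locks whose recorded owner PID no longer exists.
clean_stale_locks() {
  dir=$1
  for lock in "$dir"/*.lock; do
    [ -f "$lock" ] || continue
    pid=$(cat "$lock")
    if kill -0 "$pid" 2>/dev/null; then
      echo "skip $lock: pid $pid still running"
    else
      rm -f "$lock"
      echo "removed stale $lock (pid $pid is dead)"
    fi
  done
}

# Usage: clean_stale_locks ~/.openclaw/sessions
```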
Permanent Fix: Update OpenClaw
This issue was fixed in OpenClaw version 2026.1.23. The updated lock validation now:
Checks PID startTime and cmdline, not just whether the PID is alive
Detects "orphaned" locks where the PID matches the current process but isn't tracked in memory
Automatically removes stale locks on startup instead of waiting and timing out
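The start-time check can be sketched in shell. Assumption: the lock stores the owner's start time alongside its PID; OpenClaw's actual on-disk format is internal to the project:

```shell
#!/bin/sh
# A PID can be reused, but a (PID, start time) pair identifies one process.
pid=$$
recorded=$(ps -o lstart= -p "$pid" 2>/dev/null)   # what the lock would store
observed=$(ps -o lstart= -p "$pid" 2>/dev/null)   # what the live PID reports
if [ "$recorded" = "$observed" ]; then
  verdict="same process, lock genuinely held"
else
  verdict="different process reusing the PID, lock is stale"
fi
echo "$verdict"
```

After a container restart the two timestamps differ even when the PID matches, so the reused-PID case from the Docker scenario is classified as stale.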
npm install -g openclaw@latest && openclaw gateway restart