ClawKit Reliability Toolkit

Fix Gateway Lock Timeout

Common Error Message

gateway already running (pid 659); lock timeout after 5000ms

The Gateway uses exclusive TCP port binding to prevent multiple instances from running simultaneously, backed by per-session lock files on disk. When a previous Gateway process didn't exit cleanly, a stale lock file can block startup; this is especially common in Docker and other containerized environments, where PID reuse can fool the lock's liveness check.

Next Step

Fix now, then reduce repeat incidents

If this issue keeps coming back, validate your setup in Doctor first, then harden your config.

Quick Fix

Force restart the Gateway
openclaw gateway stop && openclaw gateway start

If that still shows the lock error, run the auto-fix:

Auto-diagnose and fix
openclaw doctor --fix

How Gateway Locking Works

The Gateway has two locking mechanisms:

1. TCP Port Binding (Primary)

On startup, the Gateway binds its WebSocket listener to ws://127.0.0.1:18789 exclusively. If another instance already holds the port, startup fails immediately. The OS automatically releases the port when the process exits — even on crashes or SIGKILL.
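Because the port is the primary lock, a quick probe tells you whether some process already holds it before you start debugging lock files. A minimal sketch using bash's /dev/tcp redirection, assuming the default port 18789 from this guide:

```shell
#!/usr/bin/env bash
# Probe the default Gateway port; a successful connect means some
# process (most likely another Gateway instance) is already bound to it.
PORT="${PORT:-18789}"
if (exec 3<>"/dev/tcp/127.0.0.1/$PORT") 2>/dev/null; then
  echo "port $PORT is in use"
else
  echo "port $PORT is free"
fi
```

The /dev/tcp form is bash-specific; on systems without bash, `nc -z 127.0.0.1 18789` is a common equivalent.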

2. Session Lock Files (Secondary)

The Gateway also maintains .lock files on disk for session-level coordination. These files contain the owning PID. When the Gateway starts, it checks if the PID in the lock file is still alive. This is where the problem occurs in containers — PID reuse can fool the check.
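You can reproduce that liveness check by hand. A sketch, assuming the lock file contains just a PID and lives under the sessions directory used later in this guide (the `main.lock` file name is hypothetical):

```shell
#!/usr/bin/env bash
# check_lock FILE: report whether the PID recorded in FILE is still alive.
# Note: "alive" is exactly the naive test that PID reuse defeats.
check_lock() {
  local pid
  [ -f "$1" ] || { echo "no lock file"; return; }
  pid=$(cat "$1")
  if kill -0 "$pid" 2>/dev/null; then
    echo "PID $pid is alive (a real Gateway, or a reused PID)"
  else
    echo "PID $pid is dead: lock is stale"
  fi
}

check_lock "$HOME/.openclaw/sessions/main.lock"   # hypothetical file name
```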

Docker / Container Environments

This error is most common in Docker, Fly.io, and Railway deployments. Here's why:

1. Container runs Gateway at PID 42. The session lock file records PID 42 as the owner.

2. Container restarts. The process is killed, but the lock file persists on the volume.

3. New Gateway starts, also at PID 42. Containers often reuse PIDs, so PID 42 is alive again.

4. Lock validation is fooled. The Gateway sees PID 42 is alive and thinks the old instance is still running. It waits for the lock to release, which never happens, and times out after 5-10 seconds.

Fix for Docker Compose

Clear lock files on container startup by adding a pre-start command:

docker-compose.yml
services:
  openclaw:
    image: openclaw/openclaw:latest
    # Clear stale locks before starting
    command: sh -c "rm -f /data/.openclaw/sessions/*.lock && openclaw gateway start"
    volumes:
      - openclaw-data:/data

# Named volumes must be declared at the top level
volumes:
  openclaw-data:

Fix for Fly.io / Railway

Dockerfile entrypoint
#!/bin/sh
# Remove stale session locks left by previous container
find /data/.openclaw/sessions -name "*.lock" -delete 2>/dev/null
exec openclaw gateway start
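For that script to run, it has to be baked into the image as the entrypoint. A minimal Dockerfile sketch; the base image tag and paths are assumptions carried over from the compose example:

```dockerfile
FROM openclaw/openclaw:latest
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```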

Manual Lock Cleanup

If you're running OpenClaw directly (not in Docker), you can manually remove stale lock files:

macOS / Linux

Remove session locks
# Check for lock files
ls -la ~/.openclaw/sessions/*.lock 2>/dev/null

# Remove stale locks
rm -f ~/.openclaw/sessions/*.lock

# Restart Gateway
openclaw gateway start

Windows

Remove session locks
REM Check for lock files
dir "%USERPROFILE%\.openclaw\sessions\*.lock" 2>nul

REM Remove stale locks
del /Q "%USERPROFILE%\.openclaw\sessions\*.lock"

REM Restart Gateway
openclaw gateway start

Only Remove Locks When Gateway Is Stopped

If the Gateway is actually running (not a stale lock), removing its lock file can cause data corruption. Always run openclaw gateway stop first, or verify the PID in the lock file is dead.
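A safer variant of the cleanup above deletes only locks whose recorded owner is dead, so a live Gateway is never disturbed. A sketch; the sessions path follows the commands above:

```shell
#!/usr/bin/env bash
# clean_stale_locks DIR: remove only lock files whose recorded PID is dead.
# Locks held by a live PID (real Gateway or reused PID) are left alone.
clean_stale_locks() {
  local f pid
  for f in "$1"/*.lock; do
    [ -f "$f" ] || continue              # glob matched nothing
    pid=$(cat "$f" 2>/dev/null)
    if ! kill -0 "$pid" 2>/dev/null; then
      echo "removing stale lock: $f (PID $pid is dead)"
      rm -f "$f"
    fi
  done
}

clean_stale_locks "$HOME/.openclaw/sessions"
```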

Permanent Fix: Update OpenClaw

This issue was fixed in OpenClaw version 2026.1.23. The updated lock validation now:

Checks PID startTime and cmdline, not just whether the PID is alive

Detects "orphaned" locks where the PID matches the current process but isn't tracked in memory

Automatically removes stale locks on startup instead of waiting and timing out
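The cmdline part of that check can be approximated in shell: a PID only counts as a live Gateway if its command line actually looks like one. This is a sketch of the idea, not OpenClaw's exact logic; the process-name match is an assumption:

```shell
#!/usr/bin/env bash
# is_gateway PID: succeed only if PID is alive AND its command line
# looks like a Gateway process, which guards against PID reuse.
is_gateway() {
  kill -0 "$1" 2>/dev/null && ps -o args= -p "$1" | grep -q "openclaw gateway"
}

if is_gateway 42; then   # 42 echoes the walkthrough above
  echo "lock owner is a real Gateway"
else
  echo "lock is stale: dead PID, or reused by an unrelated process"
fi
```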

Update to latest OpenClaw
npm install -g openclaw@latest && openclaw gateway restart

