ClawKit Reliability Toolkit

Fix Gateway Lock Timeout

Common Error Message

gateway already running (pid 659); lock timeout after 5000ms

The Gateway uses exclusive TCP port binding to prevent multiple instances from running simultaneously, backed by per-session lock files on disk. When a previous Gateway process didn't exit cleanly, a stale lock file can block startup; this is especially common in Docker and other containerized environments, where PID reuse can fool the lock's liveness check.

Next Step

Fix now, then reduce repeat incidents

If this issue keeps coming back, validate your setup in Doctor first, then harden your config.

Quick Fix

Force restart the Gateway
openclaw gateway stop && openclaw gateway start

If that still shows the lock error, run the auto-fix:

Auto-diagnose and fix
openclaw doctor --fix

How Gateway Locking Works

The Gateway has two locking mechanisms:

1. TCP Port Binding (Primary)

On startup, the Gateway binds its WebSocket listener to ws://127.0.0.1:18789 exclusively. If another instance already holds the port, startup fails immediately. The OS automatically releases the port when the process exits — even on crashes or SIGKILL.
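Because the port is the primary lock, a quick probe tells you whether some process already holds it before you start debugging lock files. A minimal sketch using bash's /dev/tcp redirection, assuming the default port 18789 from this guide:

```shell
#!/usr/bin/env bash
# Probe the default Gateway port; a successful connect means some
# process (most likely another Gateway instance) is already bound to it.
PORT="${PORT:-18789}"
if (exec 3<>"/dev/tcp/127.0.0.1/$PORT") 2>/dev/null; then
  echo "port $PORT is in use"
else
  echo "port $PORT is free"
fi
```

The /dev/tcp form is bash-specific; on systems without bash, `nc -z 127.0.0.1 18789` is a common equivalent.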

2. Session Lock Files (Secondary)

The Gateway also maintains .lock files on disk for session-level coordination. These files contain the owning PID. When the Gateway starts, it checks if the PID in the lock file is still alive. This is where the problem occurs in containers — PID reuse can fool the check.
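You can reproduce that liveness check by hand. A sketch, assuming the lock file contains just a PID and lives under the sessions directory used later in this guide (the `main.lock` file name is hypothetical):

```shell
#!/usr/bin/env bash
# check_lock FILE: report whether the PID recorded in FILE is still alive.
# Note: "alive" is exactly the naive test that PID reuse defeats.
check_lock() {
  local pid
  [ -f "$1" ] || { echo "no lock file"; return; }
  pid=$(cat "$1")
  if kill -0 "$pid" 2>/dev/null; then
    echo "PID $pid is alive (a real Gateway, or a reused PID)"
  else
    echo "PID $pid is dead: lock is stale"
  fi
}

check_lock "$HOME/.openclaw/sessions/main.lock"   # hypothetical file name
```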

Docker / Container Environments

This error is most common in Docker, Fly.io, and Railway deployments. Here's why:

1. Container runs Gateway at PID 42. The session lock file records PID 42 as the owner.

2. Container restarts. The process is killed, but the lock file persists on the volume.

3. New Gateway starts, also at PID 42. Containers often reuse PIDs, so PID 42 is alive again.

4. Lock validation is fooled. The Gateway sees PID 42 is alive and thinks the old instance is still running. It waits for the lock to release, which never happens, and times out after 5-10 seconds.

Fix for Docker Compose

Clear lock files on container startup by adding a pre-start command:

docker-compose.yml
services:
  openclaw:
    image: openclaw/openclaw:latest
    # Clear stale locks before starting
    command: sh -c "rm -f /data/.openclaw/sessions/*.lock && openclaw gateway start"
    volumes:
      - openclaw-data:/data

# Named volumes must be declared at the top level
volumes:
  openclaw-data:

Fix for Fly.io / Railway

Dockerfile entrypoint
#!/bin/sh
# Remove stale session locks left by previous container
find /data/.openclaw/sessions -name "*.lock" -delete 2>/dev/null
exec openclaw gateway start
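For that script to run, it has to be baked into the image as the entrypoint. A minimal Dockerfile sketch; the base image tag and paths are assumptions carried over from the compose example:

```dockerfile
FROM openclaw/openclaw:latest
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```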

Manual Lock Cleanup

If you're running OpenClaw directly (not in Docker), you can manually remove stale lock files:

macOS / Linux

Remove session locks
# Check for lock files
ls -la ~/.openclaw/sessions/*.lock 2>/dev/null

# Remove stale locks
rm -f ~/.openclaw/sessions/*.lock

# Restart Gateway
openclaw gateway start

Windows

Remove session locks
REM Check for lock files
dir "%USERPROFILE%\.openclaw\sessions\*.lock" 2>nul

REM Remove stale locks
del /Q "%USERPROFILE%\.openclaw\sessions\*.lock"

REM Restart Gateway
openclaw gateway start

Only Remove Locks When Gateway Is Stopped

If the Gateway is actually running (not a stale lock), removing its lock file can cause data corruption. Always run openclaw gateway stop first, or verify the PID in the lock file is dead.
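A safer variant of the cleanup above deletes only locks whose recorded owner is dead, so a live Gateway is never disturbed. A sketch; the sessions path follows the commands above:

```shell
#!/usr/bin/env bash
# clean_stale_locks DIR: remove only lock files whose recorded PID is dead.
# Locks held by a live PID (real Gateway or reused PID) are left alone.
clean_stale_locks() {
  local f pid
  for f in "$1"/*.lock; do
    [ -f "$f" ] || continue              # glob matched nothing
    pid=$(cat "$f" 2>/dev/null)
    if ! kill -0 "$pid" 2>/dev/null; then
      echo "removing stale lock: $f (PID $pid is dead)"
      rm -f "$f"
    fi
  done
}

clean_stale_locks "$HOME/.openclaw/sessions"
```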

Permanent Fix: Update OpenClaw

This issue was fixed in OpenClaw version 2026.1.23. The updated lock validation now:

Checks PID startTime and cmdline, not just whether the PID is alive

Detects "orphaned" locks where the PID matches the current process but isn't tracked in memory

Automatically removes stale locks on startup instead of waiting and timing out
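The cmdline part of that check can be approximated in shell: a PID only counts as a live Gateway if its command line actually looks like one. This is a sketch of the idea, not OpenClaw's exact logic; the process-name match is an assumption:

```shell
#!/usr/bin/env bash
# is_gateway PID: succeed only if PID is alive AND its command line
# looks like a Gateway process, which guards against PID reuse.
is_gateway() {
  kill -0 "$1" 2>/dev/null && ps -o args= -p "$1" | grep -q "openclaw gateway"
}

if is_gateway 42; then   # 42 echoes the walkthrough above
  echo "lock owner is a real Gateway"
else
  echo "lock is stale: dead PID, or reused by an unrelated process"
fi
```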

Update to latest OpenClaw
npm install -g openclaw@latest && openclaw gateway restart

