ClawKit Reliability Toolkit

Fix: Embedded Agent Times Out on Local OpenAI-Compatible LLM

openclaw agent --agent sage --local --message "Hello" --timeout 90
Error: embedded run timeout (after 90000ms)
Failover to cloud provider...

The OpenClaw embedded agent times out trying to connect to your local OpenAI-compatible LLM server, even when a direct curl call to the same endpoint succeeds from your terminal. This is almost always a binding, network resolution, or model configuration issue inside the gateway process rather than a problem with your LLM server itself.

Next Step

Fix now, then reduce repeat incidents: if this issue keeps coming back, validate your setup in Doctor first, then harden your config.

Step 1: Test Reachability from the Gateway Process

The embedded agent makes HTTP requests from within the gateway process. Test that the endpoint is reachable using the same network path:

Test LLM endpoint with curl, then from Node.js (same stack as the gateway)
# Does it respond at all?
curl -s http://localhost:1234/v1/models

# Test from Node.js (same stack as OpenClaw)
node -e "fetch('http://localhost:1234/v1/models').then(r=>r.json()).then(d=>console.log((d.data?.length ?? 0)+' models')).catch(e=>console.error('FAIL:',e.cause?.code ?? e.message))"
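If the Node probe fails, curl's exit status narrows down the cause. A quick sketch (port 1234 is the example endpoint used throughout this guide; adjust to yours):

```shell
# Probe the endpoint and map curl's exit status to a likely cause.
curl -sS --max-time 5 -o /dev/null http://localhost:1234/v1/models
case $? in
  0)  echo "reachable: the server answered" ;;
  7)  echo "connection refused: nothing listening on this address (see Step 2)" ;;
  28) echo "timed out: host reachable but no response (firewall, or model still loading)" ;;
  *)  echo "other failure: rerun with curl -v for details" ;;
esac
```

Exit code 7 points at a binding problem; exit code 28 points at a network path or firewall problem.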

Step 2: Ensure Your LLM Server Binds to the Right Address

Some LLM servers bind only to 127.0.0.1 by default. If the OpenClaw gateway runs in a container or with a different network namespace, localhost may resolve differently. Bind your LLM server explicitly to 0.0.0.0 or use the exact IP:

LM Studio / llama.cpp server binding
# LM Studio: set listen address in settings to 0.0.0.0

# llama.cpp server
./server --host 0.0.0.0 --port 1234 -m model.gguf

# vLLM
python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 8000 --model mistralai/Mistral-7B-v0.1
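You can confirm which address the server actually bound to. This sketch assumes a Linux host with ss available (on macOS, lsof -iTCP:1234 -sTCP:LISTEN is the rough equivalent); ports 1234 and 8000 are the examples above:

```shell
# List listening TCP sockets and filter for the LLM server ports.
# "127.0.0.1:1234" means localhost-only; "0.0.0.0:1234" or "*:1234"
# means the server accepts connections on all interfaces.
ss -ltn | grep -E ':(1234|8000)\s' || echo "nothing listening on 1234 or 8000"
```

If you only see a 127.0.0.1 line and the gateway runs in a container, that is your problem: rebind to 0.0.0.0 as shown above.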

Step 3: Configure Local Provider in openclaw.json

Local OpenAI-compatible provider config
{
  "model": "local/mistral-7b",
  "providers": {
    "local": {
      "baseUrl": "http://127.0.0.1:1234/v1",
      "apiKey": "not-needed"
    }
  },
  "embedded": {
    "timeout": 120000,
    "preferLocal": true
  }
}

The model name must match what your local server reports in the /v1/models response. Check the model id with: curl http://localhost:1234/v1/models | jq '.data[].id'
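To compare the two directly, a sketch (this assumes the local/ prefix in openclaw.json names the provider and the remainder must match a served model id; adjust if your naming differs):

```shell
# Extract the model id from openclaw.json and check the server serves it.
configured=$(jq -r '.model | sub("^local/"; "")' openclaw.json)
served=$(curl -s http://localhost:1234/v1/models | jq -r '.data[].id')
if echo "$served" | grep -qx "$configured"; then
  echo "ok: server serves $configured"
else
  echo "mismatch: config wants $configured, server offers:"
  echo "$served"
fi
```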

Step 4: Increase the Embedded Agent Timeout

Large models running on CPU can take 30-120 seconds to respond. The default timeout is 30 seconds. Increase it to match your hardware:

Increase timeout (ms) in openclaw.json
{
  "embedded": {
    "timeout": 300000
  }
}

Or pass it on the command line:

Pass timeout (seconds) via CLI
openclaw agent --agent sage --local --timeout 300 --message "Hello"
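To pick a value that matches your hardware rather than guessing, you can time one completion directly against the server and set the timeout comfortably above it. A sketch using the example port and model id from Step 3:

```shell
# Time a single chat completion end to end; the "real" figure is a
# lower bound for the embedded timeout you should configure.
time curl -s http://127.0.0.1:1234/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"mistral-7b","messages":[{"role":"user","content":"Hello"}]}' \
  -o /dev/null
```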

Step 5: Disable Cloud Failover

If you want to use only the local LLM and have the agent fail rather than fall back to a cloud model:

Disable cloud failover
{
  "embedded": {
    "preferLocal": true,
    "allowCloudFallback": false
  }
}

After applying the config, restart the gateway and run a test prompt. A successful response within the timeout window confirms the connection is working.
