ClawKit Logo
ClawKitReliability Toolkit
Back to Registry
Official Verified ai models Safety 4/5

rocm_vllm_deployment

Production-ready vLLM deployment on AMD ROCm GPUs. Combines environment auto-check, model parameter detection, Docker Compose deployment, health verification, and functional testing with comprehensive logging and security best practices.

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/alexhegit/rocm-vllm-deployment
Or

What This Skill Does

The rocm_vllm_deployment skill provides a production-grade automation pipeline for deploying vLLM inference services specifically on AMD ROCm-accelerated infrastructure. It abstracts the complexities of container orchestration, VRAM optimization, and dependency management. The skill performs an automated pre-flight check of the host environment, calculates necessary memory overhead, and dynamically adjusts vLLM engine parameters based on the specific model architecture detected via config.json. By leveraging Docker Compose, it ensures a reproducible and isolated environment, while simultaneously generating human-readable deployment reports and verifying service health through automated functional tests. It is designed to handle secure token management without exposing sensitive credentials in persistent configuration files, adhering to industry security best practices.

Installation

To install this skill, use the OpenClaw command-line interface: clawhub install openclaw/skills/skills/alexhegit/rocm-vllm-deployment

Ensure your host system meets the ROCm driver requirements before initiation. It is highly recommended to configure your ~/.bash_profile with your HF_TOKEN for gated model access prior to running the deployment tasks to avoid mid-process interruptions.

Use Cases

  • Rapid prototyping of LLM inference services on AMD hardware.
  • Automated CI/CD pipelines for deploying fine-tuned models in production environments.
  • Standardizing GPU resource allocation across multiple model deployments by leveraging VRAM estimation logic.
  • Monitoring and validating model health post-deployment through automated functional test suites included within the skill package.

Example Prompts

  1. "Deploy the Llama-3-8B-Instruct model on the ROCm cluster, ensure auto-scaling is enabled, and generate a performance report upon completion."
  2. "Check the current environment dependencies for vLLM and deploy Mistral-7B-v0.3 if the VRAM requirements are met."
  3. "Run a health verification on the existing vLLM container for the Phi-3 model and perform a functional test query."

Tips & Limitations

  • Tips: Always run the check-env.sh script prior to large deployments to identify missing ROCm dependencies early. For multi-GPU setups, ensure that the Docker container has access to the correct KFD (Kernel Fusion Driver) devices.
  • Limitations: The skill is optimized for ROCm-compatible hardware; performance on non-AMD platforms is not guaranteed or supported. While the skill detects model parameters, manual overrides may be necessary for extremely high-context-window requirements.

Metadata

Author@alexhegit
Stars4473
Views4
Updated2026-05-01
View Author Profile
AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-alexhegit-rocm-vllm-deployment": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags

#llm#deployment#amd#rocm#docker compose#vllm#automation#envcheck#autorepair
Safety Score: 4/5

Flags: file-read, file-write, code-execution, network-access