rocm_vllm_deployment

Production-ready vLLM deployment on AMD ROCm GPUs. Combines environment auto-check, model parameter detection, Docker Compose deployment, health verification, and functional testing with comprehensive logging and security best practices.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/alexhegit/rocm-vllm-deployment

What This Skill Does

The rocm_vllm_deployment skill automates deployment of vLLM inference services on AMD ROCm GPUs, handling container orchestration, VRAM sizing, and dependency management. It runs a pre-flight check of the host environment, estimates the memory the model needs, and adjusts vLLM engine parameters to the model architecture it reads from config.json. Deployment goes through Docker Compose, which keeps the environment reproducible and isolated. The skill then generates a human-readable deployment report and verifies service health with automated functional tests. Tokens are handled so that credentials are not written to persistent configuration files.
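To make the memory estimation step concrete, here is a minimal sketch of how weight and KV-cache sizes could be derived from a Hugging Face style config.json. It is not the skill's actual logic: the function names are hypothetical, the formula assumes a Llama-style decoder with a gated MLP, and it ignores norms, biases, activations, and runtime overhead.

```python
import json


def estimate_weight_gib(config: dict, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GiB (hypothetical helper, Llama-style layout assumed)."""
    h = config["hidden_size"]
    layers = config["num_hidden_layers"]
    inter = config["intermediate_size"]
    vocab = config["vocab_size"]
    heads = config["num_attention_heads"]
    kv_heads = config.get("num_key_value_heads", heads)
    head_dim = h // heads
    # Q and O projections are h x h; K and V shrink under grouped-query attention.
    attn = 2 * h * h + 2 * h * kv_heads * head_dim
    # Gated MLP has three matrices: gate, up, and down.
    mlp = 3 * h * inter
    # Input embeddings plus the output head, unless the two are tied.
    embed = vocab * h * (1 if config.get("tie_word_embeddings", False) else 2)
    params = layers * (attn + mlp) + embed
    return params * bytes_per_param / 2**30


def estimate_kv_cache_gib(config: dict, context_len: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV-cache memory in GiB for one sequence of context_len tokens."""
    heads = config["num_attention_heads"]
    kv_heads = config.get("num_key_value_heads", heads)
    head_dim = config["hidden_size"] // heads
    # Factor 2 covers the separate key and value tensors in every layer.
    elems = 2 * config["num_hidden_layers"] * kv_heads * head_dim * context_len
    return elems * bytes_per_elem / 2**30


def load_config(path: str) -> dict:
    """Read a model's config.json from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Comparing the sum of these two figures against free VRAM gives a rough go or no-go signal. Real deployments need extra headroom for activations and for the memory fraction vLLM reserves.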

Installation

To install this skill, use the OpenClaw command-line interface:

clawhub install openclaw/skills/skills/alexhegit/rocm-vllm-deployment

Confirm that the host meets the ROCm driver requirements before you begin. For gated models, set HF_TOKEN in your ~/.bash_profile before running the deployment tasks so that the process is not interrupted partway through.
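The prerequisites above can be checked before starting a deployment. The sketch below is an illustration only, not the skill's own pre-flight script: the preflight function is hypothetical, and it assumes the standard ROCm device nodes /dev/kfd and /dev/dri are how GPU access is exposed on the host.

```python
import os
from pathlib import Path


def preflight(env=None, device_paths=("/dev/kfd", "/dev/dri")):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper: checks only device nodes and HF_TOKEN, not driver versions.
    """
    env = os.environ if env is None else env
    problems = []
    for path in device_paths:
        if not Path(path).exists():
            problems.append(f"missing device node: {path}")
    # Report only whether the token is set; never print its value.
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set (needed for gated models)")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARN:", problem)
```

Returning a list instead of raising on the first failure lets a single run report every missing prerequisite.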

Use Cases

Rapid prototyping of LLM inference services on AMD hardware.
Automated CI/CD pipelines for deploying fine-tuned models in production environments.
Standardizing GPU resource allocation across multiple model deployments by leveraging VRAM estimation logic.
Monitoring and validating model health post-deployment through automated functional test suites included within the skill package.
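As an illustration of the health validation use case, the sketch below probes a running vLLM server through its OpenAI-compatible HTTP interface, using the /health and /v1/completions routes. The helper names, base URL, and prompt are assumptions made for this example, and the skill's bundled test suite may work differently.

```python
import json
import urllib.request


def is_healthy(base_url, opener=urllib.request.urlopen, timeout=5.0):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with opener(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError both derive from OSError.
        return False


def build_completion_request(base_url, model, prompt, max_tokens=16):
    """Build a POST request for the OpenAI-compatible completions route."""
    body = json.dumps({"model": model, "prompt": prompt, "max_tokens": max_tokens})
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def functional_test(base_url, model, prompt="Hello", timeout=60.0):
    """Send one short completion and return the generated text."""
    req = build_completion_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["text"]
```

A typical sequence is to poll is_healthy("http://localhost:8000") until it returns True, then call functional_test with the served model name and check that the returned text is non-empty.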

Example Prompts

"Deploy the Llama-3-8B-Instruct model on the ROCm cluster, ensure auto-scaling is enabled, and generate a performance report upon completion."
"Check the current environment dependencies for vLLM and deploy Mistral-7B-v0.3 if the VRAM requirements are met."
"Run a health verification on the existing vLLM container for the Phi-3 model and perform a functional test query."

Tips & Limitations

Tips: Always run the check-env.sh script before large deployments to catch missing ROCm dependencies early. For multi-GPU setups, ensure that the Docker container has access to the KFD (Kernel Fusion Driver) device, /dev/kfd, and to the GPU render nodes under /dev/dri.
Limitations: The skill is optimized for ROCm-compatible hardware; performance on non-AMD platforms is not guaranteed or supported. While the skill detects model parameters, manual overrides may be necessary for extremely high-context-window requirements.
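The device access tip and the manual context-window override can both be expressed in a Compose service definition. The sketch below builds one as a Python dictionary. It is an assumption-laden example, not the file the skill generates: the compose_service function is hypothetical, the rocm/vllm image name and the vllm serve command line are assumed to suit your setup, and the group and IPC settings follow common ROCm container practice.

```python
import json


def compose_service(model, max_model_len=None, image="rocm/vllm", port=8000):
    """Build a Compose definition for one vLLM service (hypothetical layout)."""
    command = ["vllm", "serve", model, "--port", str(port)]
    if max_model_len is not None:
        # Manual override for long-context deployments.
        command += ["--max-model-len", str(max_model_len)]
    return {
        "services": {
            "vllm": {
                "image": image,  # assumed image name; substitute your own
                "command": command,
                # ROCm containers need the compute and render device nodes.
                "devices": ["/dev/kfd:/dev/kfd", "/dev/dri:/dev/dri"],
                "group_add": ["video"],
                "ipc": "host",
                # Name only: the value is taken from the host environment,
                # so the token is never written into the file.
                "environment": ["HF_TOKEN"],
                "ports": [f"{port}:{port}"],
            }
        }
    }


if __name__ == "__main__":
    spec = compose_service("mistralai/Mistral-7B-v0.3", max_model_len=32768)
    print(json.dumps(spec, indent=2))
```

Because YAML is a superset of JSON, the printed output can be saved and passed to Docker Compose as a Compose file. Listing HF_TOKEN by name, without a value, keeps the credential out of persistent configuration.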

Metadata

Author: @alexhegit

Stars: 4473

Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-alexhegit-rocm-vllm-deployment": {
      "enabled": true,
      "auto_update": true
    }
  }
}
