rocm_vllm_deployment
Production-ready vLLM deployment on AMD ROCm GPUs. Combines environment auto-check, model parameter detection, Docker Compose deployment, health verification, and functional testing with comprehensive logging and security best practices.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/alexhegit/rocm-vllm-deployment
What This Skill Does
The rocm_vllm_deployment skill provides a production-grade automation pipeline for deploying vLLM inference services on AMD ROCm-accelerated infrastructure. It handles container orchestration, VRAM sizing, and dependency management. The skill runs an automated pre-flight check of the host environment, estimates the required memory overhead, and adjusts vLLM engine parameters based on the model architecture it detects from config.json. Docker Compose provides a reproducible, isolated environment, and the skill also generates human-readable deployment reports and verifies service health through automated functional tests. Tokens are managed without writing sensitive credentials to persistent configuration files.
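The skill's own estimation logic is not shown here, so the following is only a minimal sketch of how VRAM sizing from config.json could work. It assumes a Hugging Face style config with the usual keys (num_hidden_layers, num_attention_heads, hidden_size, and optionally num_key_value_heads, head_dim, torch_dtype). The function names, the caller-supplied weight_bytes, and the utilization fraction are all hypothetical illustrations, not the skill's API.

```python
# Hypothetical sketch: estimate how many tokens of KV cache fit in VRAM.
# Key names follow the common Hugging Face config.json layout (an assumption).

DTYPE_BYTES = {"float16": 2, "bfloat16": 2, "float32": 4}


def kv_cache_bytes_per_token(config: dict) -> int:
    """Bytes of KV cache that one token occupies across all layers."""
    layers = config["num_hidden_layers"]
    heads = config["num_attention_heads"]
    # Grouped-query attention models store fewer KV heads than query heads.
    kv_heads = config.get("num_key_value_heads") or heads
    head_dim = config.get("head_dim") or config["hidden_size"] // heads
    dtype_bytes = DTYPE_BYTES.get(config.get("torch_dtype", "float16"), 2)
    # Factor 2: one key tensor and one value tensor per layer.
    return 2 * layers * kv_heads * head_dim * dtype_bytes


def max_tokens_in_budget(config: dict, vram_bytes: int, weight_bytes: int,
                         utilization: float = 0.9) -> int:
    """Tokens of KV cache that fit after the weights are loaded.

    `utilization` is the fraction of VRAM the engine may use;
    `weight_bytes` is supplied by the caller (for example, the size of
    the checkpoint files on disk).
    """
    budget = int(vram_bytes * utilization) - weight_bytes
    return max(budget, 0) // kv_cache_bytes_per_token(config)


# Illustrative values in the shape of an 8B grouped-query model.
example_config = {
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "hidden_size": 4096,
    "torch_dtype": "bfloat16",
}
per_token = kv_cache_bytes_per_token(example_config)
```

A result like this could inform engine settings such as the maximum context length, although the real pipeline may account for further overheads (activations, graph capture, fragmentation) that this sketch ignores.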
Installation
To install this skill, use the OpenClaw command-line interface:
clawhub install openclaw/skills/skills/alexhegit/rocm-vllm-deployment
Ensure your host system meets the ROCm driver requirements before you start. For gated models, set HF_TOKEN in your ~/.bash_profile before running the deployment tasks to avoid mid-process interruptions.
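As a rough illustration of checking for the token before a deployment starts, the sketch below reads HF_TOKEN from the environment and returns only a masked summary, so the secret never reaches logs or reports. The function name and the masking format are hypothetical and are not part of the skill.

```python
import os


def masked_token(env=None, name="HF_TOKEN"):
    """Return a log-safe summary of the token, or raise if it is missing."""
    env = os.environ if env is None else env
    token = env.get(name, "")
    if not token:
        raise RuntimeError(f"{name} is not set; gated model downloads will fail")
    # Show only a short prefix and the length, never the full secret.
    return f"{token[:3]}...({len(token)} chars)"
```

Calling masked_token() early lets a script fail fast with a clear message instead of stopping partway through a model download.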
Use Cases
- Rapid prototyping of LLM inference services on AMD hardware.
- Automated CI/CD pipelines for deploying fine-tuned models in production environments.
- Standardizing GPU resource allocation across multiple model deployments by leveraging VRAM estimation logic.
- Monitoring and validating model health post-deployment through automated functional test suites included within the skill package.
Example Prompts
- "Deploy the Llama-3-8B-Instruct model on the ROCm cluster, ensure auto-scaling is enabled, and generate a performance report upon completion."
- "Check the current environment dependencies for vLLM and deploy Mistral-7B-v0.3 if the VRAM requirements are met."
- "Run a health verification on the existing vLLM container for the Phi-3 model and perform a functional test query."
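For context on what a health verification might involve, here is a minimal sketch that probes a running vLLM OpenAI-compatible server through its /health and /v1/models endpoints. The base URL, the function names, and the returned dictionary shape are assumptions for illustration; the skill's actual test suite may work differently.

```python
import json
import urllib.request


def http_get(url, timeout=5.0):
    """Fetch a URL and return (status_code, body_bytes)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read()


def verify_service(base_url, get=http_get):
    """Probe the server and list the model IDs it reports.

    `get` is injectable so the check can be exercised without a live server.
    """
    try:
        status, _ = get(f"{base_url}/health")
        if status != 200:
            return {"healthy": False, "models": []}
        status, body = get(f"{base_url}/v1/models")
    except OSError:
        # urllib raises URLError/HTTPError (both OSError subclasses)
        # when the server is unreachable or returns an error status.
        return {"healthy": False, "models": []}
    models = [m["id"] for m in json.loads(body)["data"]] if status == 200 else []
    return {"healthy": True, "models": models}
```

A functional test would typically go one step further and send a short completion request to one of the listed models, checking that a non-empty response comes back.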
Tips & Limitations
- Tips: Always run the check-env.sh script prior to large deployments to identify missing ROCm dependencies early. For multi-GPU setups, ensure that the Docker container has access to the correct KFD (Kernel Fusion Driver) devices.
- Limitations: The skill is optimized for ROCm-compatible hardware; performance on non-AMD platforms is not guaranteed or supported. While the skill detects model parameters, manual overrides may be necessary for extremely high-context-window requirements.
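The contents of check-env.sh are not shown here. As a hedged illustration of the device-access tip, the sketch below checks for the device nodes that ROCm containers are conventionally given, /dev/kfd and /dev/dri. The function name and the injectable exists parameter are hypothetical.

```python
import os

# Device nodes commonly passed through to ROCm containers.
ROCM_DEVICE_PATHS = ("/dev/kfd", "/dev/dri")


def missing_devices(paths=ROCM_DEVICE_PATHS, exists=os.path.exists):
    """Return the expected device paths that are not present."""
    return [p for p in paths if not exists(p)]


if __name__ == "__main__":
    missing = missing_devices()
    if missing:
        print("Missing ROCm device nodes:", ", ".join(missing))
    else:
        print("ROCm device nodes found.")
```

Running the same check inside the container confirms that the devices were actually mapped through, not merely present on the host.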
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-alexhegit-rocm-vllm-deployment": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags
Flags: file-read, file-write, code-execution, network-access
Related Skills
onlyclaw-social-commerce
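On the 只来龙虾 platform, automatically publish product-selling posts under a lobster identity, read posts, search posts, and like and comment. Supports linking products, shops, and Skills, plus cover images and videos (upload first, then post), enabling 24-hour automated social-commerce operation by an AI agent.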
autodream-core
Universal Memory Consolidation Engine: adapter-based cross-platform memory organization. Auto-dedup, merge, prune stale entries.
daily-report-generator
Automatically generate daily/weekly work reports from git commits, calendar events, and task lists. Use when you need to quickly create professional work reports without manual effort.
sealvera
Tamper-evident audit trail for AI agent decisions. Use when logging LLM decisions, setting up AI compliance, auditing agents for EU AI Act, HIPAA, GDPR or SOC 2, or when a user asks about AI decision audit trails, explainability, or SealVera.
Lead Radar
Every morning, scans Reddit, Hacker News, Indie Hackers, Stack Overflow, Quora, Hashnode, Dev.to, GitHub, and Lobsters for people actively asking for what you sell. Delivers the top 10 buying-intent leads to your Telegram with a pre-drafted reply. Powered by Gemini 2.5 Flash.