| .. | ||
| assets | ||
| endpoint-production.yaml | ||
| endpoint-test.yaml | ||
| overview.md | ||
| README.md | ||
DGX Station AI Skills for Coding Agents
Give your coding agent (Claude Code, Codex, Gemini CLI, Cursor) DGX Station expertise via an AGENTS.md and on-demand Agent Skills
Table of Contents
Overview
Basic idea
Modern coding agents — Claude Code, OpenAI Codex CLI, Gemini CLI, Cursor — all support two extension mechanisms: a project-level context file that's loaded into every conversation, and on-demand procedural workflows (called skills, prompts, commands, or rules depending on the harness). This playbook ships both for DGX Station:
- An
AGENTS.mdwith the critical DGX Station constraints your agent should always know (mixed coherency, GPU targeting, common pitfalls).AGENTS.mdis the cross-harness standard; aninstall.shlays it down asCLAUDE.md,GEMINI.md, orAGENTS.mddepending on the agent you use. - Four Agent Skills —
vllm-setup,sglang-setup,mig-configure,dgx-diagnose— authored once in the Anthropic Agent Skills format and installed into the right per-harness location (.claude/skills/,.codex/prompts/,.gemini/commands/, or.cursor/rules/).
This approach keeps your agent's context lean in every conversation while giving it deep procedural knowledge on demand, regardless of which agent you use.
AGENTS.md vs Agent Skill — why split?
| AGENTS.md | Agent Skill | |
|---|---|---|
| Loaded | Every conversation, automatically | Only when invoked by name (or matched by description, in Claude) |
| Best for | Constraints, pitfalls, "never do X" rules | Step-by-step workflows, deployment procedures |
| Context cost | Consumed every time | Zero until invoked |
The DGX Station mixed-coherency constraint (--gpus all will crash) should be in every conversation. The full vLLM deployment procedure should not.
What you'll accomplish
- Install the
AGENTS.mdand four Agent Skills into your project directory for your chosen agent (Claude Code, Codex, Gemini CLI, or Cursor). - Verify the agent loads the constraints automatically and the skills on demand.
- Invoke
vllm-setupto deploy a vLLM inference server with validated configuration. - Invoke
sglang-setupto deploy an SGLang inference server. - Invoke
mig-configureto partition the GB300 into MIG instances. - Invoke
dgx-diagnoseto troubleshoot common DGX Station issues.
What to know before starting
- Basic familiarity with one supported coding agent (running it, giving it prompts, using slash commands or rule references)
- General understanding of DGX Station (two GPUs, Docker-based workflows)
Prerequisites
- NVIDIA DGX Station with GB300
- One of the supported coding agents installed:
- Claude Code:
curl -fsSL https://claude.ai/install.sh | sh - OpenAI Codex CLI:
npm i -g @openai/codex - Gemini CLI:
npm i -g @google/gemini-cli - Cursor: download from
https://cursor.com/
- Claude Code:
- A project directory where you do DGX Station work
Ancillary files
assets/AGENTS.md— canonical context file with critical constraints, GPU targeting, software versions, and common pitfalls. Cross-harness standard.assets/skills/vllm-setup/SKILL.md— skill: deploy vLLM with validated configuration.assets/skills/sglang-setup/SKILL.md— skill: deploy SGLang with validated configuration.assets/skills/mig-configure/SKILL.md— skill: configure MIG partitions on the GB300.assets/skills/dgx-diagnose/SKILL.md— skill: troubleshoot common DGX Station issues.assets/install.sh— per-harness installer (claude,codex,gemini,cursor, orall).
Time & risk
- Duration: 10-15 minutes
- Risk level: Low — this playbook copies markdown files into your project directory
- Rollback: Delete the context file (
AGENTS.md/CLAUDE.md/GEMINI.md) and the harness-specific skill directory (.claude/skills/,.codex/prompts/,.gemini/commands/, or.cursor/rules/) from your project directory - Last Updated: 05/18/2026
- Restructured as harness-agnostic Agent Skills (Claude Code, Codex, Gemini CLI, Cursor)
Instructions
Step 1. Install your coding agent
Pick whichever agent you prefer — the rest of this playbook works the same regardless. Install commands:
| Agent | Install |
|---|---|
| Claude Code | curl -fsSL https://claude.ai/install.sh | sh |
| OpenAI Codex CLI | npm i -g @openai/codex |
| Gemini CLI | npm i -g @google/gemini-cli |
| Cursor | Download from https://cursor.com/ |
Verify with claude --version, codex --version, gemini --version, or by launching Cursor.
Step 2. Install the skills into your project
Navigate to the project where you want DGX Station expertise, then run the installer with the harness you use:
cd ~/your-project
## Pick one:
/path/to/this/playbook/assets/install.sh claude
/path/to/this/playbook/assets/install.sh codex
/path/to/this/playbook/assets/install.sh gemini
/path/to/this/playbook/assets/install.sh cursor
## Or install for all four at once:
/path/to/this/playbook/assets/install.sh all
If you downloaded the playbook as a zip, the path is relative to the extracted directory:
station-ai-skills/assets/install.sh claude ~/your-project
The installer is additive for skill directories (won't clobber existing skills you've written) and refuses to overwrite an existing context file (AGENTS.md, CLAUDE.md, GEMINI.md) unless you pass --force.
Resulting layout (per harness):
your-project/
AGENTS.md or CLAUDE.md or GEMINI.md # context file (named for your agent)
.claude/skills/<name>/SKILL.md # claude
.codex/prompts/<name>.md # codex
.gemini/commands/<name>.md # gemini
.cursor/rules/<name>.mdc # cursor
Where <name> is each of vllm-setup, sglang-setup, mig-configure, dgx-diagnose.
Note
Every supported agent automatically reads the context file from the working directory at startup. Skills/prompts/rules in the harness-specific directory are discovered automatically — no additional configuration needed.
Step 3. Verify the setup
Start your agent in the project directory and ask a question that requires constraint knowledge:
Can I use --gpus all to run my CUDA workload on DGX Station?
The agent should immediately warn about the mixed-coherency constraint and recommend --gpus '"device=N"' targeting. If you don't get the warning, the context file isn't being loaded — see Troubleshooting.
Then verify the skills are discoverable:
| Agent | How to check |
|---|---|
| Claude Code | Type / — vllm-setup, sglang-setup, mig-configure, dgx-diagnose should appear in the autocomplete |
| Codex CLI | Type /prompts: — same four names appear |
| Gemini CLI | Type / — same four names appear |
| Cursor | Open the Rules panel — same four rules appear |
Step 4. Use vllm-setup to deploy an inference server
Invoke the skill in your agent:
| Agent | Invocation |
|---|---|
| Claude Code | /vllm-setup (slash command) or just describe the task ("deploy vllm with Qwen3-8B") |
| Codex CLI | /prompts:vllm-setup |
| Gemini CLI | /vllm-setup |
| Cursor | In chat: "use the vllm-setup rule to deploy a vllm server" |
The agent will walk you through deploying a vLLM server with a validated container image, correct GPU targeting, and recommended parameters. It will check your GPU index, ask which model you want to serve, and generate the full docker run command.
Step 5. Use sglang-setup to deploy SGLang
Same invocation pattern, but for SGLang with the cu130 container, RadixAttention prefix caching, and structured JSON output support.
Step 6. Use mig-configure to partition the GB300
The agent will query your current MIG state, show available profiles, help you choose a layout for your workloads, and execute the partitioning commands.
Step 7. Use dgx-diagnose to troubleshoot issues
If you encounter problems, invoke dgx-diagnose. The agent will check GPU status, driver version, running processes, MIG state, and Fabric Manager to identify the issue.
Step 8. Customize
Both the AGENTS.md and the skills are plain markdown — extend them freely.
Add project-specific constraints to AGENTS.md (or your harness-specific context file):
### Project-specific
- Our production MIG layout is 3g.139gb + 2g.70gb + 2g.70gb
- Always use port 8080 for inference (nginx proxy on 443)
- Model weights are cached at /data/models, mount with -v /data/models:/root/.cache/huggingface/hub
Create new skills by adding a directory and SKILL.md to assets/skills/, then re-run install.sh:
mkdir -p assets/skills/run-benchmarks
cat > assets/skills/run-benchmarks/SKILL.md << 'EOF'
---
name: run-benchmarks
description: Run our standard inference benchmark suite against the running vLLM or SGLang server and compare against the baseline.
---
## Run benchmarks
1. Check which inference server is running (vLLM on port 8000 or SGLang on port 30000)
2. Run the appropriate benchmark script from ./benchmarks/
3. Report throughput (tokens/sec), latency (TTFT, ITL), and memory utilization
4. Compare against the baseline in ./benchmarks/baseline.json
EOF
Tip
Keep
AGENTS.mdfocused on constraints and pitfalls (things that break). Put procedural workflows in skills (things you do step-by-step).
Troubleshooting
Skills don't appear in autocomplete / aren't discoverable
Each agent discovers skills from a harness-specific directory in the current directory (or a parent). Check the right one:
| Agent | Expected location |
|---|---|
| Claude Code | .claude/skills/<name>/SKILL.md |
| Codex CLI | .codex/prompts/<name>.md |
| Gemini CLI | .gemini/commands/<name>.md |
| Cursor | .cursor/rules/<name>.mdc |
## Examples — check the directory for your agent
ls -la .claude/skills/
ls -la .codex/prompts/
ls -la .gemini/commands/
ls -la .cursor/rules/
You should see entries for vllm-setup, sglang-setup, mig-configure, and dgx-diagnose.
Check you're in the right directory:
pwd
The agent must be started from the directory containing the harness directory, or a subdirectory of it.
Context file not loaded
If the agent gives generic answers without DGX Station awareness, the context file isn't being picked up. Each agent reads a different filename — verify the one for your agent exists:
| Agent | Expected filename |
|---|---|
| Claude Code | CLAUDE.md (also reads AGENTS.md as fallback) |
| Codex CLI | AGENTS.md |
| Gemini CLI | GEMINI.md |
| Cursor | AGENTS.md |
## Verify the file exists for your agent
cat AGENTS.md | head -5
cat CLAUDE.md | head -5
cat GEMINI.md | head -5
## Restart the agent in the correct directory
cd ~/your-project
claude # or codex, gemini, etc.
All four agents read the context file from the working directory (and parent directories up to the project root).
Skill gives outdated information
The skills contain validated container versions and parameters as of the publication date. If a newer container is available, edit the canonical source and re-install:
nano /path/to/playbook/assets/skills/vllm-setup/SKILL.md
/path/to/playbook/assets/install.sh all --force
Or edit the installed copy directly:
## Claude Code
nano .claude/skills/vllm-setup/SKILL.md
## Codex
nano .codex/prompts/vllm-setup.md
## Gemini CLI
nano .gemini/commands/vllm-setup.md
## Cursor
nano .cursor/rules/vllm-setup.mdc
Tip
Skills are plain markdown — you can version them in git alongside your project code.
"Both GPUs cannot be used" errors
This is the mixed-coherency constraint working as intended. If you see CUDA errors when using --gpus all:
## Find the GB300 index
nvidia-smi --query-gpu=index,name --format=csv,noheader
## Use device-specific targeting
docker run --gpus '"device=1"' ...
The AGENTS.md covers this constraint, but if you removed that section, add it back — it's the most important piece of DGX Station knowledge.
Skills conflict with existing project directory
If your project already has a .claude/, .codex/, .gemini/, or .cursor/ directory with its own contents, install.sh is additive for skill directories — it adds the new skill files alongside whatever you already have and warns on collision rather than overwriting.
For context files (AGENTS.md, CLAUDE.md, GEMINI.md), the installer refuses to overwrite an existing file. Pass --force to override, or merge the new content manually:
## See what would be written
diff /path/to/playbook/assets/AGENTS.md ./AGENTS.md
## Force overwrite
/path/to/playbook/assets/install.sh claude . --force
Installer reports "WROTE" for some files but "SKIP" for others
That's the safe-by-default behavior. The installer skips any file that already exists, prints a warning, and continues with the rest. To get a clean install, either:
- Delete the existing files first:
rm -rf .claude/skills/{vllm-setup,sglang-setup,mig-configure,dgx-diagnose} - Or pass
--force(only affects context files; skill files are still skipped if present)