# MIG Configuration on DGX Station Configure MIG (Multi-Instance GPU) partitions on the DGX Station GB300. ## Steps 1. **Find the GB300 GPU index.** Run: ```bash nvidia-smi --query-gpu=index,name --format=csv,noheader ``` 2. **Check current MIG state:** ```bash nvidia-smi -i -q | grep -i "MIG Mode" ``` 3. **If MIG is already enabled, show current instances:** ```bash nvidia-smi mig -lgi -i nvidia-smi mig -lci -i ``` If the user wants to reconfigure, destroy existing instances first (step 6). 4. **If MIG is not enabled, enable it.** All GPU processes must be stopped first: ```bash # Check for running GPU processes sudo fuser -v /dev/nvidia* # Enable MIG sudo nvidia-smi -i -mig 1 # Verify nvidia-smi -i -q | grep -i "MIG Mode" ``` 5. **Show available profiles and help the user choose a layout:** ```bash nvidia-smi mig -lgip -i ``` Common GB300 MIG profiles: | Profile | ID | Memory | Use case | |---------|----|--------|----------| | 1g.35gb | 19 | ~35 GB | Small models (7-8B), dev/test | | 1g.35gb+me | 20 | ~35 GB | Same + media extensions | | 1g.70gb | 15 | ~70 GB | Slightly larger inference | | 2g.70gb | 14 | ~70 GB | Medium models (14-30B) | | 3g.139gb | 9 | ~139 GB | Large models (70B quantized) | | 4g.139gb | 5 | ~139 GB | Large models, more compute | | 7g.278gb | 0 | ~278 GB | Full GPU as single instance | Suggest layouts based on the user's workload. Examples: - **Two models (70B + 8B):** `3g.139gb + 2g.70gb + 2g.70gb` → IDs `9,14,14` - **Many small models:** `7 × 1g.35gb` → IDs `19,19,19,19,19,19,19` - **One large model with isolation:** `7g.278gb` → ID `0` Ask the user what models they want to run before suggesting a layout. 6. **Create (or recreate) instances:** If reconfiguring, destroy existing instances first: ```bash sudo nvidia-smi mig -dci -i sudo nvidia-smi mig -dgi -i ``` Then create the new layout: ```bash sudo nvidia-smi mig -cgi -C -i ``` 7. **Get the MIG device UUIDs:** ```bash nvidia-smi -L ``` Note the `MIG-` entries — these are used to target specific MIG instances. 8. **Show the user how to use MIG devices:** ```bash # Bare metal export CUDA_VISIBLE_DEVICES=MIG- # Docker docker run --gpus '"device=MIG-"' ... ``` 9. **Report the final layout** to the user with UUIDs and suggested docker commands for each instance. ## Disabling MIG If the user wants to return to full-GPU mode: ```bash # Stop all workloads using MIG instances first sudo nvidia-smi mig -dci -i sudo nvidia-smi mig -dgi -i sudo nvidia-smi -i -mig 0 # Ensure Fabric Manager is running for NVLink re-initialization sudo systemctl start nvidia-fabricmanager ```