**Multi-Instance GPU (MIG)** lets you partition a single NVIDIA B300 GPU on your DGX Station (GB300 Ultra) into multiple smaller GPU instances. Each instance has dedicated memory and compute, so you can run multiple workloads or users on one physical GPU without sharing memory. This playbook walks you through enabling MIG, creating a B300 MIG layout, and using the instances from bare-metal apps or containers.
MIG is controlled via `nvidia-smi`: you enable MIG mode, then create GPU instances and compute instances using B300 profile IDs (e.g. `1g.34gb`, `2g.67gb`, `7g.269gb`). When you no longer need partitioning, you disable MIG to restore full-GPU operation and NVLink P2P.
# What you'll accomplish
You will have MIG enabled and configured on your DGX Station B300 GPUs and know how to use the instances.
- **Enable MIG** on all B300 GPUs or on a per-GPU basis.
- **Create a MIG layout** using B300 profile IDs (with a known-good example for multiple GPUs).
- Root or sudo access to run `nvidia-smi -mig 1`, `nvidia-smi -mig 0`, and `nvidia-smi mig -cgi ... -C`.
- For containers/Kubernetes: `nvidia-container-toolkit` and MIG support as described in the [MIG User Guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/).
# Ancillary files
This playbook does not use repository assets; all steps use `nvidia-smi` and MIG commands on the DGX Station. For container and Kubernetes setup, use the official [MIG User Guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/) (Getting Started with MIG and Kubernetes sections).
# Time & risk
- **Estimated time:** About 15 minutes to enable MIG, create a layout, and verify. Layout design (which profiles per GPU) may take longer if you customize.
- **Risk level:** Low to Medium
- Enabling or disabling MIG requires sudo and affects all workloads on that GPU.
- Disabling MIG removes all MIG instances; ensure Fabric Manager is running on DGX/HGX B200/B300 so NVLink/NVSwitch re-initialize correctly.
Ensure your DGX Station is running with B300 GPUs (GB300 Ultra) and that the NVIDIA driver and `nvidia-smi` are available. You need root or sudo to enable MIG and create instances.
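The check and the enable step described above can be sketched as follows. This is a minimal sketch, not a definitive procedure: the `run` helper and the `DRY_RUN` variable are illustrative names introduced here (they are not part of `nvidia-smi`), and GPU index `0` is only an example.

```shell
# DRY_RUN=1 (the default in this sketch) only prints each command.
# Set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Show index, name, driver version, and current MIG mode for every GPU.
run nvidia-smi --query-gpu=index,name,driver_version,mig.mode.current --format=csv

# Enable MIG mode on GPU 0 only; drop "-i 0" to target every GPU.
run sudo nvidia-smi -i 0 -mig 1
```

Defaulting to a dry run is a deliberate choice here: enabling MIG affects every workload on the GPU, so it helps to review the exact commands before running them with `DRY_RUN=0`.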
Example layout for a 6-GPU DGX Station (adjust GPU indices and counts to match your system). Each GPU can have any combination of profiles that fits within its capacity:
You can choose any valid combination of profile IDs per GPU that fits within the B300’s capacity; the above is a known-good example. If your DGX Station has fewer than 6 GPUs, run only the `-i <N>` commands for GPUs that exist (e.g. 0 and 1 only).
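As an illustration of what per-GPU layout commands can look like, here is a hedged sketch. The three combinations are assumptions built only from the profile names mentioned in this playbook, and `create_layout`, `run`, and `DRY_RUN` are hypothetical helper names; confirm the valid profiles on your system with `nvidia-smi mig -lgip -i <N>` before creating anything.

```shell
# DRY_RUN=1 (the default in this sketch) only prints each command.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# usage: create_layout <gpu_index> <comma-separated profiles>
# -cgi creates the GPU instances; -C also creates a compute instance in each.
create_layout() {
  run sudo nvidia-smi mig -cgi "$2" -C -i "$1"
}

# Assumed example combinations; adjust indices and profiles to your system.
create_layout 0 1g.34gb,1g.34gb,1g.34gb,1g.34gb,1g.34gb,1g.34gb,1g.34gb
create_layout 1 2g.67gb,2g.67gb,2g.67gb,1g.34gb
create_layout 2 7g.269gb

# List the resulting GPU instances and MIG devices to verify.
run nvidia-smi mig -lgi
run nvidia-smi -L
```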
**Containers and Kubernetes:** use the NVIDIA MIG User Guide “Getting Started with MIG” and the Kubernetes sections. They cover the nvidia-container-toolkit, device plugin, and nvidia-mig-manager workflows for exposing MIG instances to containers.
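For a single Docker container, one common pattern is to pass a MIG UUID through `NVIDIA_VISIBLE_DEVICES` with the NVIDIA runtime. The sketch below assumes Docker with nvidia-container-toolkit is already configured; `run_in_mig`, `run`, and `DRY_RUN` are hypothetical helper names, and `MIG-<uuid>` and `<cuda-image>` are placeholders you must replace. Treat the MIG User Guide as the authoritative reference.

```shell
# DRY_RUN=1 (the default in this sketch) only prints each command.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# usage: run_in_mig <MIG-uuid> <image> [command...]
# Exposes exactly one MIG instance to the container.
run_in_mig() {
  mig_uuid="$1"
  image="$2"
  shift 2
  run docker run --rm --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES="$mig_uuid" "$image" "$@"
}

# Placeholders: take the UUID from `nvidia-smi -L` and pick a CUDA image
# that matches your driver.
run_in_mig "MIG-<uuid>" "<cuda-image>" nvidia-smi -L
```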
> This removes all MIG instances and returns each B300 to a single full-GPU instance. Any workloads using MIG UUIDs will need to be reconfigured or restarted.
This resets the GPUs. On DGX/HGX B200/B300, ensure **Fabric Manager** is running so that NVLinks and NVSwitch fabric routing are re-initialized after MIG is disabled.
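The teardown described above can be sketched as follows. This is a minimal sketch under stated assumptions: `disable_mig`, `run`, and `DRY_RUN` are hypothetical helper names, GPU index `0` is an example, and Fabric Manager is assumed to be installed as the `nvidia-fabricmanager` systemd unit.

```shell
# DRY_RUN=1 (the default in this sketch) only prints each command.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# usage: disable_mig <gpu_index>
# Compute instances must be destroyed before the GPU instances that hold
# them; only then is MIG mode switched off.
disable_mig() {
  run sudo nvidia-smi mig -dci -i "$1"
  run sudo nvidia-smi mig -dgi -i "$1"
  run sudo nvidia-smi -i "$1" -mig 0
}

disable_mig 0

# Assumed unit name; confirms Fabric Manager is active on DGX/HGX systems.
run systemctl is-active nvidia-fabricmanager
```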
| Symptom | Cause | Fix |
|---------|-------|-----|
| `nvidia-smi -mig 1` fails or "MIG mode not supported" | Driver too old or GPU not MIG-capable | Ensure you have a B300 (or other MIG-capable GPU) and a driver version that supports MIG on B300. Check `nvidia-smi -q` and [MIG User Guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/) for supported hardware/driver. |
| `nvidia-smi mig -cgi ... -C -i N` fails (e.g. "Invalid combination") | Profile combination exceeds GPU capacity or invalid IDs | Run `nvidia-smi mig -lgip -i N` and use only listed profile IDs. Ensure the sum of instance sizes does not exceed the B300’s capacity for that GPU. |
| MIG instances not visible after creation | Instances not created or wrong GPU index | Run `nvidia-smi -L` and `nvidia-smi mig -lgi` to confirm. Re-run the `-cgi` commands for the correct `-i <gpu_index>`. |
| App doesn’t see MIG device when using `CUDA_VISIBLE_DEVICES=MIG-<uuid>` | Wrong UUID or app not using `CUDA_VISIBLE_DEVICES` | Get UUIDs from `nvidia-smi -L`. Export `CUDA_VISIBLE_DEVICES=MIG-<uuid>` in the same shell before launching the app. |
| After `nvidia-smi -mig 0`, NVLink or fabric issues on DGX/HGX | Fabric Manager not re-initializing | On DGX/HGX B200/B300, ensure Fabric Manager is running after disabling MIG so NVLinks and NVSwitch fabric are re-initialized. |
| Permission denied when running `nvidia-smi -mig` or `nvidia-smi mig -cgi` | Need root for MIG operations | Use `sudo` for `nvidia-smi -mig 1/0` and `nvidia-smi mig -cgi ... -C`. |
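
To reduce copy-paste mistakes with MIG UUIDs, the UUID can be extracted from the `nvidia-smi -L` listing instead of typed by hand. The sketch below assumes each MIG line contains `(UUID: MIG-...)`; `mig_uuids` is a hypothetical helper name, and the sample listing (including the GPU name and both UUIDs) is made up so the snippet runs without a GPU.

```shell
# Print every MIG UUID found in `nvidia-smi -L`-style text on stdin.
# Lines for full GPUs (UUID: GPU-...) are ignored.
mig_uuids() {
  sed -n 's/.*(UUID: \(MIG-[^)]*\)).*/\1/p'
}

# Made-up sample listing, used only for illustration.
sample='GPU 0: NVIDIA B300 (UUID: GPU-00000000-0000-0000-0000-000000000000)
  MIG 1g.34gb     Device  0: (UUID: MIG-11111111-2222-3333-4444-555555555555)'

first="$(printf '%s\n' "$sample" | mig_uuids | head -n 1)"
echo "CUDA_VISIBLE_DEVICES=$first"

# On a real system, in the same shell that launches the app:
#   export CUDA_VISIBLE_DEVICES="$(nvidia-smi -L | mig_uuids | head -n 1)"
```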