Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git (synced 2026-04-26 03:43:52 +00:00)

chore: Regenerate all playbooks

parent 32fb5fe2cf
commit 3455359d65
@@ -290,6 +290,7 @@ for medical image analysis and reasoning tasks.
 |---------|-------|-----|
 | VLLM container fails to start | Insufficient GPU memory | Reduce `--gpu-memory-utilization` to 0.25 |
 | Model download fails | Network connectivity or HF auth | Check `huggingface-cli whoami` and internet |
+| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your HuggingFace token; and request access to the gated model on your web browser |
 | Open WebUI shows connection error | Wrong backend URL | Verify `OPENAI_API_BASE_URL` is set correctly |
 | Model doesn't show full reasoning | Reasoning tags enabled | Disable "Reasoning Tags" in Chat Controls → Advanced Params |
 
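The gated-repo row added in the hunk above names two separate fixes (regenerate the token, request access). As a rough illustration of how to tell which one applies, here is a minimal Python sketch that is not part of the playbook. `gated_repo_hint` is a hypothetical helper, and the mapping rests on an assumption: that the "Cannot access gated repo" message is accompanied by HTTP 401 when no valid token was sent and by HTTP 403 when the token is valid but access has not been granted.

```python
def gated_repo_hint(status_code: int) -> str:
    """Map the HTTP status of a failed model download to a suggested next step."""
    # Assumption (not confirmed by the playbook): 401 means the request carried
    # no valid token, 403 means the token is valid but access was never granted.
    if status_code == 401:
        return "Regenerate your HuggingFace token and log in again."
    if status_code == 403:
        return "Request access to the gated model in your web browser."
    return "Not a gated-repo status; check network connectivity."


if __name__ == "__main__":
    for code in (401, 403, 500):
        print(code, "->", gated_repo_hint(code))
```

The status code normally appears in the error text itself (for example "401 Client Error"), so it can be read off the failing log line and passed to the helper.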
@@ -140,6 +140,10 @@ docker volume rm "$(basename "$PWD")_postgres_data"
 
 ## Troubleshooting
 
+| Symptom | Cause | Fix |
+|---------|--------|-----|
+| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your HuggingFace token; and request access to the gated model on your web browser |
+
 > **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
 > With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
 > the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
@@ -163,6 +163,7 @@ docker stop <container_id>
 | "CUDA out of memory" error | Insufficient GPU memory | Reduce `kv_cache_free_gpu_memory_fraction` to 0.9 or use a device with more VRAM |
 | Container fails to start | Docker GPU support issues | Verify `nvidia-docker` is installed and `--gpus=all` flag is supported |
 | Model download fails | Network or authentication issues | Check HuggingFace authentication and network connectivity |
+| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your HuggingFace token; and request access to the gated model on your web browser |
 | Server doesn't respond | Port conflicts or firewall | Check if port 8000 is available and not blocked |
 
 > **Note:** DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
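For the "Server doesn't respond" row in the hunk above, one way to check whether port 8000 is already taken is a quick bind probe. The following is a minimal Python sketch, not part of the playbook; `port_is_free` is a hypothetical helper name, and it assumes the server is expected on the loopback interface. It only detects a port conflict and cannot tell whether a firewall is blocking the port.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            # Binding fails with OSError if another socket already holds the port.
            probe.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    print("port 8000 free:", port_is_free(8000))
```

If the probe reports the port as busy before the server container is started, another process is likely holding it; if it reports free while the server should be running, the server probably never bound the port.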
@@ -383,6 +383,7 @@ http://192.168.100.10:8265
 |---------|--------|-----|
 | Node 2 not visible in Ray cluster | Network connectivity issue | Verify QSFP cable connection, check IP configuration |
 | Model download fails | Authentication or network issue | Re-run `huggingface-cli login`, check internet access |
+| Cannot access gated repo for URL | Certain HuggingFace models have restricted access | Regenerate your HuggingFace token; and request access to the gated model on your web browser |
 | CUDA out of memory with 405B | Insufficient GPU memory | Use 70B model or reduce max_model_len parameter |
 | Container startup fails | Missing ARM64 image | Rebuild vLLM image following ARM64 instructions |
 