chore: Regenerate all playbooks

GitLab CI 2026-01-21 17:01:21 +00:00
parent c6467ceb5d
commit b678b0b25e
3 changed files with 1582 additions and 1458 deletions


@@ -81,8 +81,8 @@ All required assets can be found [in the Portfolio Optimization repository](http
 * **Rollback:** Stop the Docker container and remove the cloned repository to fully remove the installation.
-* **Last Updated:** 01/02/2026
-  * First Publication
+* **Last Updated:** 01/21/2026
+  * Update `git clone` command with the correct project path.
 ## Instructions
@@ -104,7 +104,7 @@ docker --version
 Open up Terminal, then copy and paste in the below commands:
 ```bash
-git clone https://github.com/NVIDIA/dgx-spark-playbooks/nvidia/portfolio-optimization
+git clone https://github.com/NVIDIA/dgx-spark-playbooks
 cd dgx-spark-playbooks/nvidia/portfolio-optimization/assets
 bash ./setup/start_playbook.sh
 ```
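The corrected command clones the repository root rather than a subdirectory path, which `git clone` cannot fetch on its own. A quick sanity check after cloning can confirm the expected playbook path exists before running the setup script; this is a minimal sketch, and the `check_assets_dir` helper is hypothetical, not part of the playbook:

```shell
# Hypothetical helper (not part of the playbook): verify that the
# cloned repository contains the expected playbook assets directory
# before invoking setup/start_playbook.sh.
check_assets_dir() {
  # $1: repository root produced by `git clone`
  local assets="$1/nvidia/portfolio-optimization/assets"
  if [ -d "$assets" ]; then
    echo "found"
  else
    echo "missing"
  fi
}

# Usage: check_assets_dir dgx-spark-playbooks
```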

File diff suppressed because one or more lines are too long


@@ -84,9 +84,8 @@ Reminder: not all model architectures are supported for NVFP4 quantization.
 * **Duration:** 30 minutes for Docker approach
 * **Risks:** Container registry access requires internal credentials
 * **Rollback:** Container approach is non-destructive.
-* **Last Updated:** 01/02/2026
-  * Add supported Model Matrix (25.11-py3)
-  * Improve cluster setup instructions
+* **Last Updated:** 01/21/2026
+  * Update Llama-3.1-405B inference server command to avoid Out-of-Memory errors.
 ## Instructions
@@ -351,8 +350,8 @@ Start the server with memory-constrained parameters for the large model.
 export VLLM_CONTAINER=$(docker ps --format '{{.Names}}' | grep -E '^node-[0-9]+$')
 docker exec -it $VLLM_CONTAINER /bin/bash -c '
 vllm serve hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4 \
---tensor-parallel-size 2 --max-model-len 256 --gpu-memory-utilization 1.0 \
---max-num-seqs 1 --max_num_batched_tokens 256'
+--tensor-parallel-size 2 --max-model-len 64 --gpu-memory-utilization 0.9 \
+--max-num-seqs 1 --max_num_batched_tokens 64'
 ```
 ## Step 12. (Optional) Test 405B model inference
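The new flags trade context length for memory headroom: `--max-model-len 64` shrinks the KV cache vLLM pre-allocates, and `--gpu-memory-utilization 0.9` leaves 10% of GPU memory free instead of claiming all of it, which matters since the INT4 weights of a 405B-parameter model alone occupy roughly 100 GB per GPU at `--tensor-parallel-size 2`. A pre-flight free-memory check could look like the sketch below; the `enough_gpu_memory` function and its 110000 MiB threshold are illustrative assumptions, not measured requirements of this playbook:

```shell
# Hypothetical pre-flight guard (not part of the playbook): decide
# whether a GPU reports enough free memory before launching the 405B
# server. The 110000 MiB threshold is an assumed ballpark figure.
enough_gpu_memory() {
  # $1: free GPU memory in MiB, e.g. from:
  #   nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits
  local required_mib=110000
  if [ "$1" -ge "$required_mib" ]; then
    echo "ok"
  else
    echo "insufficient"
  fi
}
```

Running the query once per GPU and aborting on `insufficient` gives a clearer failure than letting `vllm serve` hit an Out-of-Memory error mid-load.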