Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git (synced 2026-04-22 18:13:52 +00:00)
Update vLLM container version to 25.11-py3
Parent: 373591c46f
Commit: be2db4604f
@@ -57,9 +57,9 @@ support for ARM64.
 
 ## Step 1. Pull vLLM container image
 
-Find the latest container build from https://catalog.ngc.nvidia.com/orgs/nvidia/containers/vllm?version=25.09-py3
+Find the latest container build from https://catalog.ngc.nvidia.com/orgs/nvidia/containers/vllm?version=25.11-py3
 ```
-docker pull nvcr.io/nvidia/vllm:25.09-py3
+docker pull nvcr.io/nvidia/vllm:25.11-py3
 ```
 
 ## Step 2. Test vLLM in container
@@ -68,7 +68,7 @@ Launch the container and start vLLM server with a test model to verify basic fun
 
 ```bash
 docker run -it --gpus all -p 8000:8000 \
-nvcr.io/nvidia/vllm:25.09-py3 \
+nvcr.io/nvidia/vllm:25.11-py3 \
 vllm serve "Qwen/Qwen2.5-Math-1.5B-Instruct"
 ```
 
@@ -96,7 +96,7 @@ Expected response should contain `"content": "204"` or similar mathematical calc
 For container approach (non-destructive):
 
 ```bash
-docker rm $(docker ps -aq --filter ancestor=nvcr.io/nvidia/vllm:25.09-py3)
+docker rm $(docker ps -aq --filter ancestor=nvcr.io/nvidia/vllm:25.11-py3)
 docker rmi nvcr.io/nvidia/vllm
 ```
 
@@ -150,8 +150,8 @@ After this, you should be able to run docker commands without using `sudo`.
 
 
 ```bash
-docker pull nvcr.io/nvidia/vllm:25.09-py3
-export VLLM_IMAGE=nvcr.io/nvidia/vllm:25.09-py3
+docker pull nvcr.io/nvidia/vllm:25.11-py3
+export VLLM_IMAGE=nvcr.io/nvidia/vllm:25.11-py3
 ```
 
 
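Note: this commit changes the same tag in four separate places in the playbook. One possible way to reduce that, offered only as a sketch and not as part of this commit, is to derive the image reference from a single variable. `VLLM_TAG` is a hypothetical name introduced here for illustration; `VLLM_IMAGE` is the variable the playbook already exports. The commands are echoed, not executed, so the sketch runs without Docker installed.

```shell
# Hypothetical: VLLM_TAG is not defined anywhere in the playbook.
# Defaults to the tag introduced by this commit if unset.
VLLM_TAG="${VLLM_TAG:-25.11-py3}"

# VLLM_IMAGE is the variable name the playbook already uses.
export VLLM_IMAGE="nvcr.io/nvidia/vllm:${VLLM_TAG}"

# Dry run: print the commands the playbook steps would use.
PULL_CMD="docker pull ${VLLM_IMAGE}"
RUN_CMD="docker run -it --gpus all -p 8000:8000 ${VLLM_IMAGE} vllm serve \"Qwen/Qwen2.5-Math-1.5B-Instruct\""
echo "${PULL_CMD}"
echo "${RUN_CMD}"
```

With this arrangement a future version bump would change only the default value of `VLLM_TAG`, assuming every step in the playbook reads `VLLM_IMAGE` instead of a literal tag.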