Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git, synced 2026-04-22 01:53:53 +00:00

chore: Regenerate all playbooks

parent 63975362f1
commit bf842ce358
@@ -33,7 +33,7 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
 - [Multi-modal Inference](nvidia/multi-modal-inference/)
 - [NCCL for Two Sparks](nvidia/nccl/)
 - [Fine-tune with NeMo](nvidia/nemo-fine-tune/)
-- [Use a NIM on Spark](nvidia/nim-llm/)
+- [NIM on Spark](nvidia/nim-llm/)
 - [NVFP4 Quantization](nvidia/nvfp4-quantization/)
 - [Ollama](nvidia/ollama/)
 - [Open WebUI with Ollama](nvidia/open-webui/)
New files:
- nvidia/cuda-x-data-science/assets/cudf_pandas_demo.ipynb (1000 lines; file diff suppressed because it is too large)
- nvidia/cuda-x-data-science/assets/cuml_sklearn_demo.ipynb (2395 lines; file diff suppressed because one or more lines are too long)
@@ -1,6 +1,6 @@
-# Use a NIM on Spark
+# NIM on Spark
 
-> Run an LLM NIM on Spark
+> Deploy a NIM on Spark
 
 ## Table of Contents
@@ -19,17 +19,11 @@
 
 ### Basic idea
 
-NVIDIA Inference Microservices (NIMs) provide optimized containers for deploying large language
-models with simplified APIs. This playbook demonstrates how to run LLM NIMs on DGX Spark devices,
-enabling GPU-accelerated inference through Docker containers. You'll set up authentication with
-NVIDIA's registry, launch a containerized LLM service, and perform basic inference testing to
-verify functionality.
+NVIDIA NIM is containerized software for fast, reliable AI model serving and inference on NVIDIA GPUs. This playbook demonstrates how to run NIM microservices for LLMs on DGX Spark devices, enabling local GPU inference through a simple Docker workflow. You'll authenticate with NVIDIA's registry, launch the NIM inference microservice, and perform basic inference testing to verify functionality.
 
 ### What you'll accomplish
 
-You'll deploy an LLM NIM container on your DGX Spark device, configure it for GPU acceleration,
-and establish a working inference endpoint that responds to HTTP API calls with generated text
-completions.
+You'll launch a NIM container on your DGX Spark device to expose a GPU-accelerated HTTP endpoint for text completions.
 
 ### What to know before starting
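The playbook text in this diff describes authenticating with NVIDIA's registry, launching the NIM container, and then testing the resulting HTTP endpoint. As a minimal sketch of that last step, assuming the NIM serves an OpenAI-compatible chat completions API on localhost port 8000 (the default in NIM LLM containers) and using a placeholder model name:

```python
import json
import urllib.request

# Assumptions: the NIM container is already running and listening on port 8000,
# and MODEL is a hypothetical placeholder for whichever NIM you launched.
NIM_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "meta/llama-3.1-8b-instruct"


def build_request(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-style chat completions payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }


def query_nim(prompt: str) -> str:
    """POST the payload to the NIM endpoint and return the generated text."""
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses put the text under choices[0].message.content.
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(query_nim("Say hello in one short sentence."))
```

The same request can be issued with `curl` against the `/v1/chat/completions` path; this is only a sketch of the verification step, not the playbook's own test script.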