Fine-tune with PyTorch
Use PyTorch to fine-tune models locally
Overview
Basic Idea
This playbook guides you through setting up and using PyTorch to fine-tune large language models on NVIDIA Spark devices.
What you'll accomplish
You'll establish a complete fine-tuning environment for large language models (1-70B parameters) on your NVIDIA Spark device. By the end, you'll have a working installation that supports parameter-efficient fine-tuning (PEFT) and supervised fine-tuning (SFT).
What to know before starting
Prerequisites
These recipes are written specifically for DIGITS SPARK. Make sure the OS and drivers are up to date.
Ancillary files
All files required for fine-tuning are included.
Time & risk
Time estimate: 30-45 minutes for setup and running fine-tuning. Fine-tuning run time varies with model size.
Risks: Model downloads can be large (several GB), and ARM64 package compatibility issues may require troubleshooting.
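As a rough guide to the download sizes mentioned above, a checkpoint takes approximately the parameter count times the bytes stored per parameter. The sketch below assumes 16-bit weights (2 bytes per parameter); that precision and the function name are assumptions for illustration, not something this playbook specifies.

```python
def checkpoint_size_gb(num_params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate checkpoint size in GB (1 GB = 10^9 bytes).

    Assumes every parameter is stored at the same width; 2 bytes
    corresponds to 16-bit weights (an assumption, not a playbook setting).
    """
    # (billions of params * 1e9) * bytes_per_param / 1e9 bytes per GB
    return num_params_billion * bytes_per_param


for size in (3, 8, 70):
    print(f"{size}B parameters -> ~{checkpoint_size_gb(size):.0f} GB")
```

Under that assumption, the largest models in the 1-70B range need well over 100 GB of free disk space, so check available storage before starting a download.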
Rollback:
Instructions
Step 1. Configure Docker permissions
To easily manage containers without sudo, you must be in the docker group. If you choose to skip this step, you will need to run Docker commands with sudo.
Open a new terminal and test Docker access. In the terminal, run:
docker ps
If you see a permission denied error (something like permission denied while trying to connect to the Docker daemon socket), add your user to the docker group:
sudo usermod -aG docker $USER
Warning: After running usermod, you must log out and log back in to start a new session with updated group permissions.
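To confirm that the new session actually picked up the group change, you can inspect the groups of the current process. This is a minimal sketch using only the Python standard library on Linux; the function name is illustrative.

```python
import grp
import os


def in_group(group_name: str = "docker") -> bool:
    """Return True if the current process belongs to group_name."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this system.
        return False
    # os.getgroups() reflects the current session, so it stays unchanged
    # until you log out and back in after running usermod.
    return gid in os.getgroups() or gid == os.getegid()


print("docker group active in this session:", in_group())
```

If this prints False right after running usermod, the session has not been refreshed yet.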
Step 2. Pull the latest PyTorch container
docker pull nvcr.io/nvidia/pytorch:25.09-py3
Step 3. Launch Docker
docker run --gpus all -it --rm --ipc=host \
-v $HOME/.cache/huggingface:/root/.cache/huggingface \
-v ${PWD}:/workspace -w /workspace \
nvcr.io/nvidia/pytorch:25.09-py3
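The first -v flag bind-mounts the host's Hugging Face cache into the container, so downloaded models survive the --rm cleanup. If that host directory does not exist yet, Docker creates it for you, typically owned by root, which can cause permission problems on the host later. The sketch below creates the directory up front as your own user; it is meant to run on the host before docker run, and the helper name is illustrative.

```python
from pathlib import Path


def ensure_hf_cache(home=None) -> Path:
    """Create <home>/.cache/huggingface if it is missing and return its path.

    With no argument, the current user's home directory is used.
    """
    cache = Path(home or Path.home()) / ".cache" / "huggingface"
    cache.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return cache
```

Calling ensure_hf_cache() with no argument targets the same path that the docker run command above mounts.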
Step 4. Install dependencies inside the container
pip install transformers peft datasets "trl==0.19.1" "bitsandbytes==0.48"
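Because two of these packages are pinned, it can help to confirm what actually got installed before running a recipe. This is a minimal sketch using the standard library's importlib.metadata; the package list mirrors the pip command above plus torch, which is assumed to ship with the container image.

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGES = ["torch", "transformers", "peft", "datasets", "trl", "bitsandbytes"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```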
Step 5. Authenticate with Hugging Face
huggingface-cli login
# Enter your Hugging Face token when prompted.
# Enter n when asked about the git credential.
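To check whether a token is already available before logging in again, you can look in the places the Hugging Face tooling normally reads. The sketch below assumes the default huggingface_hub behaviour: an HF_TOKEN environment variable, or a token file under HF_HOME (by default ~/.cache/huggingface). Treat those locations as assumptions if your setup overrides them; the function name is illustrative.

```python
import os
from pathlib import Path


def find_hf_token(env=None, home=None):
    """Return 'env', the token file path, or None if no token is found."""
    env = os.environ if env is None else env
    if env.get("HF_TOKEN"):
        return "env"
    default_home = Path(home or Path.home()) / ".cache" / "huggingface"
    hf_home = Path(env.get("HF_HOME") or default_home)
    token_file = hf_home / "token"
    # Only report the file if it exists and is not empty.
    if token_file.is_file() and token_file.read_text().strip():
        return str(token_file)
    return None


print(find_hf_token() or "no token found; run huggingface-cli login")
```

Since Step 3 mounts the host's ~/.cache/huggingface at /root/.cache/huggingface, a token saved inside the container should also persist on the host under these assumptions.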
Step 6. Clone the git repo with fine-tuning recipes
git clone https://gitlab.com/nvidia/dgx-spark/temp-external-playbook-assets/dgx-spark-playbook-assets
cd dgx-spark-playbook-assets/${MODEL}/assets
Step 7. Run the fine-tuning recipes
To run LoRA fine-tuning on Llama3-8B, use the following command:
python Llama3_8B_LoRA_finetuning.py
To run QLoRA fine-tuning on Llama3-70B, use the following command:
python Llama3_70B_qLoRA_finetuning.py
To run full fine-tuning on Llama3-3B, use the following command:
python Llama3_3B_full_finetuning.py
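The three recipes differ mainly in how many weights are trained. As a hedged illustration of why LoRA is cheaper than full fine-tuning: LoRA freezes a weight matrix of shape d_out x d_in and trains two low-rank factors of shapes d_out x r and r x d_in instead. The layer size and rank below are hypothetical and are not taken from the recipe scripts.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one d_out x d_in weight matrix."""
    # Factor A has shape (rank, d_in); factor B has shape (d_out, rank).
    return rank * (d_in + d_out)


d_in = d_out = 4096  # hypothetical projection size
rank = 16            # hypothetical LoRA rank

full = d_in * d_out  # parameters trained by full fine-tuning of this matrix
lora = lora_trainable_params(d_in, d_out, rank)
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {lora / full:.2%}")
# prints: full: 16,777,216  LoRA: 131,072  ratio: 0.78%
```

QLoRA uses the same low-rank update but additionally keeps the frozen base weights in 4-bit quantized form, which is what makes the 70B recipe feasible on a single device.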