mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git
synced 2026-04-23 02:23:53 +00:00

chore: Regenerate all playbooks

This commit is contained in:
parent 0d9108cf14
commit 3ed5b3b073
@@ -24,6 +24,7 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
 - [Comfy UI](nvidia/comfy-ui/)
 - [Set Up Local Network Access](nvidia/connect-to-your-spark/)
 - [Connect Two Sparks](nvidia/connect-two-sparks/)
+- [CUDA-X Data Science](nvidia/cuda-x-data-science/)
 - [DGX Dashboard](nvidia/dgx-dashboard/)
 - [FLUX.1 Dreambooth LoRA Fine-tuning](nvidia/flux-finetuning/)
 - [Optimized JAX](nvidia/jax/)
@@ -43,10 +44,11 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting
 - [TRT LLM for Inference](nvidia/trt-llm/)
 - [Text to Knowledge Graph](nvidia/txt2kg/)
 - [Unsloth on DGX Spark](nvidia/unsloth/)
+- [Vibe Coding in VS Code](nvidia/vibe-coding/)
 - [Install and Use vLLM for Inference](nvidia/vllm/)
 - [Vision-Language Model Fine-tuning](nvidia/vlm-finetuning/)
 - [VS Code](nvidia/vscode/)
-- [Video Search and Summarization](nvidia/vss/)
+- [Build a Video Search and Summarization (VSS) Agent](nvidia/vss/)

 ## Resources

82  nvidia/cuda-x-data-science/README.md  Normal file
@@ -0,0 +1,82 @@
# CUDA-X Data Science

> Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas and more with zero code changes

## Table of Contents

- [Overview](#overview)
- [Instructions](#instructions)

---

## Overview

## Basic idea

This playbook includes two example notebooks that demonstrate the acceleration of key machine learning algorithms and core pandas operations using CUDA-X Data Science libraries:
- **NVIDIA cuDF:** Accelerates data preparation and core data processing of 8 GB of string data, with no code changes.
- **NVIDIA cuML:** Accelerates popular, compute-intensive machine learning algorithms in scikit-learn (LinearSVC), UMAP, and HDBSCAN, with no code changes.

CUDA-X Data Science (formerly RAPIDS) is an open-source library collection that accelerates the data science and data processing ecosystem. These libraries accelerate popular Python tools like scikit-learn and pandas with zero code changes. On DGX Spark, they maximize performance at your desk with your existing code.

## What you'll accomplish

You will accelerate popular machine learning algorithms and data analytics operations on the GPU, understand how to accelerate popular Python tools, and see the value of running data science workflows on your DGX Spark.
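The zero-code-change claim is easiest to see with plain pandas. The sketch below uses a tiny made-up stand-in for the job-postings data processed in this playbook's notebooks; once `cudf.pandas` is loaded, the identical code runs on the GPU, because the accelerator intercepts the same pandas API.

```python
import pandas as pd

# Toy stand-in for the LinkedIn job-postings dataset used in the notebooks.
postings = pd.DataFrame({
    "job_link": ["a", "b", "c"],
    "job_title": ["Engineer", "Nurse", "Agent"],
})
summaries = pd.DataFrame({
    "job_link": ["a", "b"],
    "job_summary": ["Build things", "Care for patients"],
})

# The same two operations the demo notebook times: a left merge and a
# per-row string length. Under `%load_ext cudf.pandas` this code is unchanged.
merged = postings.merge(summaries, how="left", on="job_link")
merged["summary_length"] = merged["job_summary"].str.len()
print(merged[["job_title", "summary_length"]])
```

On CPU this is ordinary pandas; the playbook's point is that the exact same lines scale to the multi-gigabyte dataset when executed under the accelerator.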
## Prerequisites

- Familiarity with pandas, scikit-learn, and machine learning algorithms such as support vector machines, clustering, and dimensionality reduction
- Install conda
- Generate a Kaggle API key
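The Kaggle API key is just a small JSON credentials file. A side-effect-free sketch of what the kaggle.json handling in the notebook amounts to (the username and key values are placeholders, and a temp directory stands in for your home directory):

```python
import json
import tempfile
from pathlib import Path

# A temp directory stands in for ~ so this sketch has no side effects.
home = Path(tempfile.mkdtemp())
kaggle_dir = home / ".kaggle"
kaggle_dir.mkdir()

# Placeholder credentials -- substitute the values from your Kaggle account page.
creds = {"username": "your-kaggle-username", "key": "your-api-key"}

path = kaggle_dir / "kaggle.json"
path.write_text(json.dumps(creds))
path.chmod(0o600)  # the Kaggle client warns about world-readable credential files
print(oct(path.stat().st_mode & 0o777))
```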
## Time & risk

* **Duration:** 20-30 minutes of setup time and 2-3 minutes to run each notebook.
* **Risk level:**
  * Data download slowness or failure due to network issues
  * Kaggle API key generation failures requiring retries
* **Rollback:** No permanent system changes are made during normal usage.
## Instructions
## Step 1. Verify system requirements

- Verify that the system has CUDA 13 installed using `nvcc --version` or `nvidia-smi`
- Install conda using [these instructions](https://docs.anaconda.com/miniconda/install/)
- Create a Kaggle API key using [these instructions](https://www.kaggle.com/discussions/general/74235) and place the **kaggle.json** file in the same folder as the notebook
## Step 2. Install the Data Science libraries

- Use the following command to install the CUDA-X libraries (this creates a new conda environment):

```bash
conda create -n rapids-test -c rapidsai-nightly -c conda-forge -c nvidia \
    rapids=25.10 python=3.12 'cuda-version=13.0' \
    jupyter hdbscan umap-learn
```
## Step 3. Activate the conda environment

- Activate the conda environment:

```bash
conda activate rapids-test
```
## Step 4. Clone the playbook repository

- Clone the GitHub repository and go to the assets folder inside the cuda-x-data-science folder:

```bash
git clone https://github.com/NVIDIA/dgx-spark-playbooks
cd dgx-spark-playbooks/nvidia/cuda-x-data-science/assets
```

- Place the **kaggle.json** file created in Step 1 in the assets folder
## Step 5. Run the notebooks

There are two notebooks in the GitHub repository.

One runs an example of a large string data processing workflow with pandas code on the GPU.

- Run the cudf_pandas_demo.ipynb notebook and use `localhost:8888` in your browser to access it:

```bash
jupyter notebook cudf_pandas_demo.ipynb
```

The other walks through machine learning algorithms including UMAP and HDBSCAN.

- Run the cuml_sklearn_demo.ipynb notebook and use `localhost:8888` in your browser to access it:

```bash
jupyter notebook cuml_sklearn_demo.ipynb
```

If you are accessing your DGX Spark remotely, make sure to forward the necessary port so you can reach the notebook in your local browser. Use the following command for port forwarding:

```bash
ssh -N -L YYYY:localhost:XXXX username@remote_host
```

- `YYYY`: The local port you want to use (e.g. 8888)
- `XXXX`: The port you specified when starting Jupyter Notebook on the remote machine (e.g. 8888)
- `-N`: Prevents SSH from executing a remote command
- `-L`: Specifies local port forwarding
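Once an SSH tunnel is up, you can sanity-check that the forwarded local port is actually listening before opening the browser. A small sketch (the port number here is whatever you chose as the local port, e.g. 8888):

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# After e.g. `ssh -N -L 8888:localhost:8888 username@remote_host`,
# the tunnel should be listening on the local port:
print(port_open("localhost", 8888))
```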
939  nvidia/cuda-x-data-science/assets/cudf_pandas_demo.ipynb  Normal file
@@ -0,0 +1,939 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "84635d55-68a2-468b-ac09-9029ebdab55f",
   "metadata": {
    "id": "84635d55-68a2-468b-ac09-9029ebdab55f"
   },
   "source": [
    "# Accelerating large string data processing with cudf pandas accelerator mode (cudf.pandas)\n",
    "<a href=\"https://github.com/rapidsai/cudf\">cuDF</a> is a Python GPU DataFrame library (built on the Apache Arrow columnar memory format) for loading, joining, aggregating, filtering, and otherwise manipulating tabular data using a DataFrame-style API in the style of pandas.\n",
    "\n",
    "cuDF now provides a <a href=\"https://rapids.ai/cudf-pandas/\">pandas accelerator mode</a> (`cudf.pandas`), allowing you to bring accelerated computing to your pandas workflows without requiring any code change.\n",
    "\n",
    "This notebook demonstrates how cuDF pandas accelerator mode can help accelerate the processing of datasets with large string fields (4 GB+) by simply adding a `%load_ext` command. We introduced this feature as part of our RAPIDS 24.08 release.\n",
    "\n",
    "**Author:** Allison Ding, Mitesh Patel <br>\n",
    "**Date:** October 3, 2025"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bb8fe7ab-c055-40e9-897d-c62c72f28a16",
   "metadata": {
    "id": "bb8fe7ab-c055-40e9-897d-c62c72f28a16"
   },
   "source": [
    "# ⚠️ Verify your setup\n",
    "\n",
    "First, we'll verify that you are running with an NVIDIA GPU."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "a88b8586-cfdd-4d31-9b4d-9be8508f7ba0",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "a88b8586-cfdd-4d31-9b4d-9be8508f7ba0",
    "outputId": "18525b64-b34b-40e3-ed3a-1ad56ae794b5"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fri Oct 3 23:16:52 2025 \n",
      "+-----------------------------------------------------------------------------------------+\n",
      "| NVIDIA-SMI 580.82.09 Driver Version: 580.82.09 CUDA Version: 13.0 |\n",
      "+-----------------------------------------+------------------------+----------------------+\n",
      "| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |\n",
      "| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |\n",
      "| | | MIG M. |\n",
      "|=========================================+========================+======================|\n",
      "| 0 NVIDIA GB10 Off | 0000000F:01:00.0 Off | N/A |\n",
      "| N/A 44C P0 10W / N/A | Not Supported | 0% Default |\n",
      "| | | N/A |\n",
      "+-----------------------------------------+------------------------+----------------------+\n",
      "\n",
      "+-----------------------------------------------------------------------------------------+\n",
      "| Processes: |\n",
      "| GPU GI CI PID Type Process name GPU Memory |\n",
      "| ID ID Usage |\n",
      "|=========================================================================================|\n",
      "| 0 N/A N/A 3405 G /usr/lib/xorg/Xorg 242MiB |\n",
      "| 0 N/A N/A 3562 G /usr/bin/gnome-shell 53MiB |\n",
      "| 0 N/A N/A 214921 C .../envs/rapids-25.10/bin/python 196MiB |\n",
      "+-----------------------------------------------------------------------------------------+\n"
     ]
    }
   ],
   "source": [
    "!nvidia-smi # this should display information about available GPUs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5cd58071-4371-428b-8a02-9cd66e6cb91f",
   "metadata": {
    "id": "5cd58071-4371-428b-8a02-9cd66e6cb91f"
   },
   "source": [
    "# Download the data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9eb67713-7cf4-415a-bce7-ff4695862faa",
   "metadata": {
    "id": "9eb67713-7cf4-415a-bce7-ff4695862faa"
   },
   "source": [
    "## Overview\n",
    "The data we'll be working with summarizes job postings data that a developer working at a job listing firm might analyze to understand posting trends.\n",
    "\n",
    "We'll need to download a curated copy of this [Kaggle dataset](https://www.kaggle.com/datasets/asaniczka/1-3m-linkedin-jobs-and-skills-2024/data?select=job_summary.csv) directly via the Kaggle API.\n",
    "\n",
    "**Data License and Terms** <br>\n",
    "As this dataset originates from a Kaggle dataset, it's governed by that dataset's license and terms of use, which is the Open Data Commons license. Review it here: https://opendatacommons.org/licenses/by/1-0/index.html. For each dataset a user elects to use, the user is responsible for checking if the dataset license is fit for the intended purpose.\n",
    "\n",
    "**Are there restrictions on how I can use this data? </br>**\n",
    "For each dataset a user elects to use, the user is responsible for checking if the dataset license is fit for the intended purpose.\n",
    "\n",
    "## Get the Data\n",
    "First, [please follow these instructions from Kaggle to download and/or update your Kaggle API token to get access to the dataset](https://www.kaggle.com/discussions/general/74235).\n",
    "\n",
    "Once generated, make sure the **kaggle.json** file is in the same folder as the notebook.\n",
    "\n",
    "Next, run the code below, which should also take 1-2 minutes:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "406838c6-267c-423e-82ab-ea13d5fa9c90",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: kaggle in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (1.7.4.5)\n",
      "Requirement already satisfied: bleach in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (6.2.0)\n",
      "Requirement already satisfied: certifi>=14.05.14 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (2025.8.3)\n",
      "Requirement already satisfied: charset-normalizer in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (3.4.3)\n",
      "Requirement already satisfied: idna in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (3.10)\n",
      "Requirement already satisfied: protobuf in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (6.32.1)\n",
      "Requirement already satisfied: python-dateutil>=2.5.3 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (2.9.0.post0)\n",
      "Requirement already satisfied: python-slugify in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (8.0.4)\n",
      "Requirement already satisfied: requests in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (2.32.5)\n",
      "Requirement already satisfied: setuptools>=21.0.0 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (80.9.0)\n",
      "Requirement already satisfied: six>=1.10 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (1.17.0)\n",
      "Requirement already satisfied: text-unidecode in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (1.3)\n",
      "Requirement already satisfied: tqdm in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (4.67.1)\n",
      "Requirement already satisfied: urllib3>=1.15.1 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (2.5.0)\n",
      "Requirement already satisfied: webencodings in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (0.5.1)\n"
     ]
    }
   ],
   "source": [
    "!pip install kaggle\n",
    "!mkdir -p ~/.kaggle\n",
    "!cp kaggle.json ~/.kaggle/\n",
    "!chmod 600 ~/.kaggle/kaggle.json"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "3efacb3c-5f3d-4ff0-b32a-76bbb80b5f74",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "3efacb3c-5f3d-4ff0-b32a-76bbb80b5f74",
    "outputId": "5fe4a878-cf57-44f9-e40e-ed413035b150"
   },
   "outputs": [],
   "source": [
    "# Download the dataset through the Kaggle API\n",
    "!kaggle datasets download -d asaniczka/1-3m-linkedin-jobs-and-skills-2024\n",
    "# Unzip the file to access the contents\n",
    "!unzip 1-3m-linkedin-jobs-and-skills-2024.zip"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2__ZMVe6LaBJ",
   "metadata": {
    "id": "2__ZMVe6LaBJ"
   },
   "source": [
    "# Analysis with cuDF Pandas"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df47f304-2b30-4380-afd5-0613b63d103d",
   "metadata": {},
   "source": [
    "The magic command `%load_ext cudf.pandas` enables GPU acceleration for pandas data processing in a Jupyter notebook, allowing most pandas operations to automatically execute on NVIDIA GPUs for improved performance.\n",
    "\n",
    "With this extension loaded before importing pandas, your code can use standard pandas syntax while gaining the benefits of GPU speedup, automatically falling back to CPU execution for operations not supported on the GPU. This provides a seamless way to accelerate existing pandas workflows with zero code changes, especially for large data analytics tasks or machine learning preprocessing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e5cd2520-30a6-41c1-b7c5-5abe0eb90d82",
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext cudf.pandas"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "eadb8d77-cb45-4c7c-ae9f-77e47a4f29b3",
   "metadata": {
    "id": "eadb8d77-cb45-4c7c-ae9f-77e47a4f29b3"
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "196268f2-6169-4ed7-a9e6-db9078caa6ab",
   "metadata": {
    "id": "196268f2-6169-4ed7-a9e6-db9078caa6ab"
   },
   "source": [
    "We'll run a piece of code to get a feel for what GPU acceleration brings to pandas workflows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ae3b6a16-ff72-4421-b43c-06c33f57ec12",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ae3b6a16-ff72-4421-b43c-06c33f57ec12",
    "outputId": "656acbf7-078f-42b3-832d-ad4e84e01c70"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 185 ms, sys: 2.08 s, total: 2.27 s\n",
      "Wall time: 2.95 s\n",
      "Dataset Size (in GB): 4.76\n"
     ]
    }
   ],
   "source": [
    "%%time \n",
    "job_summary_df = pd.read_csv(\"job_summary.csv\", dtype=('str'))\n",
    "print(\"Dataset Size (in GB):\", round(job_summary_df.memory_usage(\n",
    "    deep=True).sum()/(1024**3), 2))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01c506e1-f135-4afb-8fc7-23e72c05d73c",
   "metadata": {
    "id": "01c506e1-f135-4afb-8fc7-23e72c05d73c"
   },
   "source": [
    "The same dataset takes around 1.5 minutes to load with plain pandas. That's around a **5x speedup** with no changes to the code!"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9d0a0e1-1d74-494d-bd12-b829f11eeede",
   "metadata": {
    "id": "d9d0a0e1-1d74-494d-bd12-b829f11eeede"
   },
   "source": [
    "Let's load the remaining two datasets as well:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "12e4cf7e-8824-4822-9d30-46b81ba2acd7",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "12e4cf7e-8824-4822-9d30-46b81ba2acd7",
    "outputId": "5ca1be17-09e3-40ab-928b-82176bf597bf"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 45.3 ms, sys: 199 ms, total: 244 ms\n",
      "Wall time: 354 ms\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "job_skills_df = pd.read_csv(\"job_skills.csv\", dtype=('str'))\n",
    "job_postings_df = pd.read_csv(\"linkedin_job_postings.csv\", dtype=('str'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "13c8f9da-121f-4311-8a79-274425363e5e",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 276
    },
    "id": "13c8f9da-121f-4311-8a79-274425363e5e",
    "outputId": "a73599c1-05b2-4f56-a190-c69c017bb330"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 4.46 ms, sys: 3.1 ms, total: 7.56 ms\n",
      "Wall time: 46.3 ms\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "0 957\n",
       "1 3816\n",
       "2 5314\n",
       "3 2774\n",
       "4 2749\n",
       "Name: summary_length, dtype: int32"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%time\n",
    "job_summary_df['summary_length'] = job_summary_df['job_summary'].str.len()\n",
    "job_summary_df['summary_length'].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "67b68792-5c64-4ebd-9d80-cf6ff55baeef",
   "metadata": {
    "id": "67b68792-5c64-4ebd-9d80-cf6ff55baeef"
   },
   "source": [
    "That was lightning fast! We went from around 10+ seconds (with pandas) to a few milliseconds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "31e1cc84-debb-4da7-bc20-5c7139f786f7",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 504
    },
    "id": "31e1cc84-debb-4da7-bc20-5c7139f786f7",
    "outputId": "2d89fc49-7e5b-41db-c25b-441d54480711"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 39.8 ms, sys: 30 ms, total: 69.8 ms\n",
      "Wall time: 211 ms\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       " job_link \\\n",
       "0 https://www.linkedin.com/jobs/view/account-exe... \n",
       "1 https://www.linkedin.com/jobs/view/registered-... \n",
       "2 https://www.linkedin.com/jobs/view/restaurant-... \n",
       "3 https://www.linkedin.com/jobs/view/independent... \n",
       "4 https://www.linkedin.com/jobs/view/group-unit-... \n",
       "\n",
       " last_processed_time got_summary got_ner is_being_worked \\\n",
       "0 2024-01-21 07:12:29.00256+00 t t f \n",
       "1 2024-01-21 07:39:58.88137+00 t t f \n",
       "2 2024-01-21 07:40:00.251126+00 t t f \n",
       "3 2024-01-21 07:40:00.308133+00 t t f \n",
       "4 2024-01-19 09:45:09.215838+00 f f f \n",
       "\n",
       " job_title \\\n",
       "0 Account Executive - Dispensing (NorCal/Norther... \n",
       "1 Registered Nurse - RN Care Manager \n",
       "2 RESTAURANT SUPERVISOR - THE FORKLIFT \n",
       "3 Independent Real Estate Agent \n",
       "4 Group/Unit Supervisor (Systems Support Manager... \n",
       "\n",
       " company job_location first_seen \\\n",
       "0 BD San Diego, CA 2024-01-15 \n",
       "1 Trinity Health MI Norton Shores, MI 2024-01-14 \n",
       "2 Wasatch Adaptive Sports Sandy, UT 2024-01-14 \n",
       "3 Howard Hanna | Rand Realty Englewood Cliffs, NJ 2024-01-16 \n",
       "4 IRS, Office of Chief Counsel Chamblee, GA 2024-01-17 \n",
       "\n",
       " search_city search_country search_position \\\n",
       "0 Coronado United States Color Maker \n",
       "1 Grand Haven United States Director Nursing Service \n",
       "2 Tooele United States Stand-In \n",
       "3 Pinehurst United States Real-Estate Clerk \n",
       "4 Gadsden United States Supervisor Travel-Information Center \n",
       "\n",
       " job_level job_type job_summary \\\n",
       "0 Mid senior Onsite Responsibilities\\nJob Description Summary\\nJob... \n",
       "1 Mid senior Onsite Employment Type:\\nFull time\\nShift:\\nDescripti... \n",
       "2 Mid senior Onsite Job Details\\nDescription\\nWhat You'll Do\\nAs a... \n",
       "3 Mid senior Onsite Who We Are\\nRand Realty is a family-owned brok... \n",
       "4 Mid senior Onsite None \n",
       "\n",
       " summary_length \n",
       "0 4602 \n",
       "1 2950 \n",
       "2 4571 \n",
       "3 3944 \n",
       "4 <NA> "
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%time\n",
    "df_merged=pd.merge(job_postings_df, job_summary_df, how=\"left\", on=\"job_link\")\n",
    "df_merged.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "0160a559-2b17-40a6-ad9d-34ce746236d0",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 490
    },
    "id": "0160a559-2b17-40a6-ad9d-34ce746236d0",
    "outputId": "e397c28b-a90d-42d2-8a9a-4c6260c45b38"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 33.2 ms, sys: 17.3 ms, total: 50.6 ms\n",
      "Wall time: 120 ms\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>summary_length</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>company</th>\n",
       "      <th>job_title</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>ClickJobs.io</th>\n",
       "      <th>Adolescent Behavioral Health Therapist - Substance Use Specialty (Entry Senior Level) Psychiatry</th>\n",
       "      <td>23748.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Mt. San Antonio College</th>\n",
       "      <th>Chief, Police and Campus Safety</th>\n",
       "      <td>22998.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>CareerBeacon</th>\n",
       "      <th>Airside/Groundside Project Manager [Halifax International Airport Authority]</th>
|
" <td>22938.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>Tacoma Community College</th>\n",
|
||||||
|
" <th>Anthropology Professor - Part-time</th>\n",
|
||||||
|
" <td>22790.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>IRS, Office of Chief Counsel</th>\n",
|
||||||
|
" <th>Program Analyst (12-Month Roster)</th>\n",
|
||||||
|
" <td>22774.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>...</th>\n",
|
||||||
|
" <th>...</th>\n",
|
||||||
|
" <td>...</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th rowspan=\"4\" valign=\"top\">鴻海精密工業股份有限公司</th>\n",
|
||||||
|
" <th>HR Specialist - Payroll & Benefit</th>\n",
|
||||||
|
" <td>0.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>Material Planner</th>\n",
|
||||||
|
" <td>0.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>RFQ Specialist</th>\n",
|
||||||
|
" <td>0.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>Supply Chain Program Manager</th>\n",
|
||||||
|
" <td>0.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>🌟Daniel-Scott Recruitment Ltd🌟</th>\n",
|
||||||
|
" <th>IT Manager</th>\n",
|
||||||
|
" <td>0.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" </tbody>\n",
|
||||||
|
"</table>\n",
|
||||||
|
"<p>801276 rows × 1 columns</p>\n",
|
||||||
|
"</div>"
|
||||||
|
],
|
||||||
|
"text/plain": [
|
||||||
|
" summary_length\n",
|
||||||
|
"company job_title \n",
|
||||||
|
"ClickJobs.io Adolescent Behavioral Health Therapist - Substa... 23748.0\n",
|
||||||
|
"Mt. San Antonio College Chief, Police and Campus Safety 22998.0\n",
|
||||||
|
"CareerBeacon Airside/Groundside Project Manager [Halifax Int... 22938.0\n",
|
||||||
|
"Tacoma Community College Anthropology Professor - Part-time 22790.0\n",
|
||||||
|
"IRS, Office of Chief Counsel Program Analyst (12-Month Roster) 22774.0\n",
|
||||||
|
"... ...\n",
|
||||||
|
"鴻海精密工業股份有限公司 HR Specialist - Payroll & Benefit 0.0\n",
|
||||||
|
" Material Planner 0.0\n",
|
||||||
|
" RFQ Specialist 0.0\n",
|
||||||
|
" Supply Chain Program Manager 0.0\n",
|
||||||
|
"🌟Daniel-Scott Recruitment Ltd🌟 IT Manager 0.0\n",
|
||||||
|
"\n",
|
||||||
|
"[801276 rows x 1 columns]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 40,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"%%time\n",
|
||||||
|
"df_merged.groupby(['company',\"job_title\"]).agg({\n",
|
||||||
|
" \"summary_length\":\"mean\"}).sort_values(by='summary_length', ascending = False).fillna(0)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "IME4urGYQ3qS",
|
||||||
|
"metadata": {
|
||||||
|
"id": "IME4urGYQ3qS"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"We went down from around 5 seconds to less than a second here. This is in line with our speedups on other operations!"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 41,
|
||||||
|
"id": "adc00726-f151-41f4-8731-a1ce1f83eea2",
|
||||||
|
"metadata": {
|
||||||
|
"colab": {
|
||||||
|
"base_uri": "https://localhost:8080/",
|
||||||
|
"height": 458
|
||||||
|
},
|
||||||
|
"id": "adc00726-f151-41f4-8731-a1ce1f83eea2",
|
||||||
|
"outputId": "46423696-b167-4ffe-bb3b-9de7f3e6d668"
|
||||||
|
},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"CPU times: user 13.7 ms, sys: 20.3 ms, total: 34 ms\n",
|
||||||
|
"Wall time: 156 ms\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/html": [
|
||||||
|
"<div>\n",
|
||||||
|
"<style scoped>\n",
|
||||||
|
" .dataframe tbody tr th:only-of-type {\n",
|
||||||
|
" vertical-align: middle;\n",
|
||||||
|
" }\n",
|
||||||
|
"\n",
|
||||||
|
" .dataframe tbody tr th {\n",
|
||||||
|
" vertical-align: top;\n",
|
||||||
|
" }\n",
|
||||||
|
"\n",
|
||||||
|
" .dataframe thead th {\n",
|
||||||
|
" text-align: right;\n",
|
||||||
|
" }\n",
|
||||||
|
"</style>\n",
|
||||||
|
"<table border=\"1\" class=\"dataframe\">\n",
|
||||||
|
" <thead>\n",
|
||||||
|
" <tr style=\"text-align: right;\">\n",
|
||||||
|
" <th></th>\n",
|
||||||
|
" <th>job_title</th>\n",
|
||||||
|
" <th>job_location</th>\n",
|
||||||
|
" <th>summary_length</th>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" </thead>\n",
|
||||||
|
" <tbody>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>0</th>\n",
|
||||||
|
" <td>🔥Nurse Manager, Patient Services - Operating Room</td>\n",
|
||||||
|
" <td>Lake George, NY</td>\n",
|
||||||
|
" <td>7342.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>1</th>\n",
|
||||||
|
" <td>🔥Behavioral Health RN 3 12s</td>\n",
|
||||||
|
" <td>Glens Falls, NY</td>\n",
|
||||||
|
" <td>2787.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>2</th>\n",
|
||||||
|
" <td>🔥 Surgical Technologist - Evenings</td>\n",
|
||||||
|
" <td>Lake George, NY</td>\n",
|
||||||
|
" <td>2920.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>3</th>\n",
|
||||||
|
" <td>🔥 Physician Practice Clinical Lead RN</td>\n",
|
||||||
|
" <td>Saratoga Springs, NY</td>\n",
|
||||||
|
" <td>2945.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>4</th>\n",
|
||||||
|
" <td>🔥 Physican Practice LPN - Green</td>\n",
|
||||||
|
" <td>Lake George, NY</td>\n",
|
||||||
|
" <td>2969.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>...</th>\n",
|
||||||
|
" <td>...</td>\n",
|
||||||
|
" <td>...</td>\n",
|
||||||
|
" <td>...</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>1104106</th>\n",
|
||||||
|
" <td>\"Attorney\" (Gov Appt/Non-Merit) Jobs</td>\n",
|
||||||
|
" <td>Kentucky, United States</td>\n",
|
||||||
|
" <td>2427.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>1104107</th>\n",
|
||||||
|
" <td>\"Accountant\"</td>\n",
|
||||||
|
" <td>Shavano Park, TX</td>\n",
|
||||||
|
" <td>1497.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>1104108</th>\n",
|
||||||
|
" <td>\"Accountant\"</td>\n",
|
||||||
|
" <td>Basking Ridge, NJ</td>\n",
|
||||||
|
" <td>1073.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>1104109</th>\n",
|
||||||
|
" <td>\"Accountant\"</td>\n",
|
||||||
|
" <td>Austin, TX</td>\n",
|
||||||
|
" <td>1993.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>1104110</th>\n",
|
||||||
|
" <td>\"A\" Softball Coach - Central Middle School</td>\n",
|
||||||
|
" <td>East Corinth, ME</td>\n",
|
||||||
|
" <td>718.0</td>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" </tbody>\n",
|
||||||
|
"</table>\n",
|
||||||
|
"<p>1104111 rows × 3 columns</p>\n",
|
||||||
|
"</div>"
|
||||||
|
],
|
||||||
|
"text/plain": [
|
||||||
|
" job_title \\\n",
|
||||||
|
"0 🔥Nurse Manager, Patient Services - Operating Room \n",
|
||||||
|
"1 🔥Behavioral Health RN 3 12s \n",
|
||||||
|
"2 🔥 Surgical Technologist - Evenings \n",
|
||||||
|
"3 🔥 Physician Practice Clinical Lead RN \n",
|
||||||
|
"4 🔥 Physican Practice LPN - Green \n",
|
||||||
|
"... ... \n",
|
||||||
|
"1104106 \"Attorney\" (Gov Appt/Non-Merit) Jobs \n",
|
||||||
|
"1104107 \"Accountant\" \n",
|
||||||
|
"1104108 \"Accountant\" \n",
|
||||||
|
"1104109 \"Accountant\" \n",
|
||||||
|
"1104110 \"A\" Softball Coach - Central Middle School \n",
|
||||||
|
"\n",
|
||||||
|
" job_location summary_length \n",
|
||||||
|
"0 Lake George, NY 7342.0 \n",
|
||||||
|
"1 Glens Falls, NY 2787.0 \n",
|
||||||
|
"2 Lake George, NY 2920.0 \n",
|
||||||
|
"3 Saratoga Springs, NY 2945.0 \n",
|
||||||
|
"4 Lake George, NY 2969.0 \n",
|
||||||
|
"... ... ... \n",
|
||||||
|
"1104106 Kentucky, United States 2427.0 \n",
|
||||||
|
"1104107 Shavano Park, TX 1497.0 \n",
|
||||||
|
"1104108 Basking Ridge, NJ 1073.0 \n",
|
||||||
|
"1104109 Austin, TX 1993.0 \n",
|
||||||
|
"1104110 East Corinth, ME 718.0 \n",
|
||||||
|
"\n",
|
||||||
|
"[1104111 rows x 3 columns]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 41,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"%%time\n",
|
||||||
|
"# Group by company, job_title, and month, and calculate the mean of summary_length\n",
|
||||||
|
"grouped_df = df_merged.groupby(['job_title', 'job_location']).agg({'summary_length': 'mean'})\n",
|
||||||
|
"\n",
|
||||||
|
"# Reset index to sort by job_title and month\n",
|
||||||
|
"grouped_df = grouped_df.reset_index()\n",
|
||||||
|
"\n",
|
||||||
|
"# Sort by job_title and month\n",
|
||||||
|
"sorted_df = grouped_df.sort_values(by=['job_title', 'job_location','summary_length'],\n",
|
||||||
|
" ascending=False).reset_index(drop=True).fillna(0)\n",
|
||||||
|
"sorted_df"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "08c97b81-64c5-48fb-8fe0-d36789cf3deb",
|
||||||
|
"metadata": {
|
||||||
|
"id": "08c97b81-64c5-48fb-8fe0-d36789cf3deb"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"The acceleration is consistently 10x+ for complex aggregations and sorting that involve multiple columns."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "9bcc719b-666a-4bc9-97d6-16f448b5c707",
|
||||||
|
"metadata": {
|
||||||
|
"id": "9bcc719b-666a-4bc9-97d6-16f448b5c707"
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"# Summary\n",
|
||||||
|
"\n",
|
||||||
|
"With cudf.pandas, you can keep using pandas as your primary dataframe library. When things start to get a little slow, just load the `cudf.pandas` extension and enjoy the incredible speedups.\n",
|
||||||
|
"\n",
|
||||||
|
"To learn more about cudf.pandas, we encourage you to visit https://rapids.ai/cudf-pandas."
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"accelerator": "GPU",
|
||||||
|
"colab": {
|
||||||
|
"gpuType": "T4",
|
||||||
|
"provenance": []
|
||||||
|
},
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3 (ipykernel)",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.12.11"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 5
|
||||||
|
}
|
||||||
2311 nvidia/cuda-x-data-science/assets/cuml_sklearn_demo.ipynb (new file)
File diff suppressed because one or more lines are too long

190 nvidia/vibe-coding/README.md (new file)
@@ -0,0 +1,190 @@
# Vibe Coding in VS Code

> Use DGX Spark as a local or remote Vibe Coding assistant with Ollama and Continue

## Table of Contents

- [Overview](#overview)
- [What You'll Accomplish](#what-youll-accomplish)
- [Prerequisites](#prerequisites)
- [Requirements](#requirements)
- [Instructions](#instructions)
- [Troubleshooting](#troubleshooting)

---

## Overview

## DGX Spark Vibe Coding

This playbook walks you through setting up DGX Spark as a **Vibe Coding assistant** — locally or as a remote coding companion for VSCode with Continue.dev.

This guide uses **Ollama** with **GPT-OSS 120B** for easy deployment of a coding assistant to VSCode. It also includes advanced instructions for making the DGX Spark and Ollama coding assistant available over your local network. This guide assumes a **fresh installation** of the OS. If your OS is not freshly installed and you run into issues, see the troubleshooting section at the bottom of this document.

### What You'll Accomplish

You'll have a fully configured DGX Spark system capable of:
- Running local code assistance through Ollama.
- Serving models remotely for Continue and VSCode integration.
- Hosting large LLMs like GPT-OSS 120B using unified memory.

### Prerequisites

- DGX Spark (128GB unified memory recommended)
- Internet access for model downloads
- Basic familiarity with the terminal
- Optional: firewall control for remote access configuration

### Requirements

- **Ollama** and an LLM of your choice (e.g., `gpt-oss:120b`)
- **VSCode**
- **Continue** VSCode extension
- Basic familiarity with opening the Linux terminal and copying and pasting commands
- sudo access

## Instructions

## Step 1. Install Ollama

Install the latest version of Ollama using the following command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Once the service is running, pull the desired model:

```bash
ollama pull gpt-oss:120b
```
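After the pull completes, you can sanity-check the model from any language that speaks HTTP. Below is a minimal Python sketch against Ollama's `/api/generate` endpoint; the localhost URL and model name are assumptions matching this guide's defaults:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama endpoint used in this guide

def build_payload(model: str, prompt: str) -> dict:
    # Minimal non-streaming request body for Ollama's /api/generate endpoint
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    # POST the prompt and return the completed response text
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("gpt-oss:120b", "Write a haiku about GPUs.")` requires the Ollama service to be running and the model to be pulled; if it fails, see the troubleshooting section.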
## Step 2. (Optional) Enable Remote Access

To allow remote connections (e.g., from a workstation using VSCode and Continue), modify the Ollama systemd service:

```bash
sudo systemctl edit ollama
```

Add the following lines beneath the commented section:

```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"
```

Reload and restart the service:

```bash
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

If using a firewall, open port 11434:

```bash
sudo ufw allow 11434/tcp
```

Verify that the workstation can connect to your DGX Spark's Ollama server:

```bash
curl -v http://YOUR_SPARK_IP:11434/api/version
```

Replace YOUR_SPARK_IP with your DGX Spark's IP address.
If the connection fails, see the troubleshooting section at the bottom of this document.
## Step 3. Install VSCode

For DGX Spark (ARM-based), download and install VSCode:
Navigate to https://code.visualstudio.com/download and download the Linux ARM64 version of VSCode. After
the download completes, note the package name and use it in the next command in place of DOWNLOADED_PACKAGE_NAME.

```bash
sudo dpkg -i DOWNLOADED_PACKAGE_NAME
```

If using a remote workstation, **install VSCode appropriate for your system architecture**.

## Step 4. Install Continue.dev Extension

Open VSCode and install **Continue.dev** from the Marketplace.
After installation, click the Continue icon on the right-hand bar.

## Step 5. Local Inference Setup

- Click **Or, configure your own models**
- Click **Click here to view more providers**
- Choose **Ollama** as the provider.
- For **Model**, select **Autodetect**.
- Test inference by sending a test prompt.

Your downloaded model (e.g., `gpt-oss:120b`) will now be the default for inference.
## Step 6. Setting up a Workstation to Connect to the DGX Spark's Ollama Server

To connect a workstation running VSCode to a remote DGX Spark instance, complete the following on that workstation:
- Install Continue from the marketplace.
- Click on the Continue icon on the left pane.
- Click **Or, configure your own models**
- Click **Click here to view more providers**
- Select **Ollama** from the provider list.
- Select **Autodetect** as the model.

Continue **will** fail to detect the model, as it is attempting to connect to a locally hosted Ollama server.
- Find the **gear** icon in the upper right corner of the chat window and click on it.
- On the left pane, click **Models**
- Next to the first dropdown menu under **Chat**, click the gear icon.
- Continue's config.yaml will open. Take note of your DGX Spark's IP address.
- Replace the configuration with the following. **YOUR_SPARK_IP** should be replaced with your DGX Spark's IP.

```yaml
name: Config
version: 1.0.0
schema: v1

assistants:
  - name: default
    model: OllamaSpark

models:
  - name: OllamaSpark
    provider: ollama
    model: gpt-oss:120b
    apiBase: http://YOUR_SPARK_IP:11434
    title: gpt-oss:120b
    roles:
      - chat
      - edit
      - autocomplete
```

Replace `YOUR_SPARK_IP` with the IP address of your DGX Spark.
Add additional model entries for any other Ollama models you wish to host remotely.
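Before editing config.yaml, it can help to confirm which models the Spark's Ollama server actually exposes. A small Python sketch against Ollama's `/api/tags` endpoint; the `api_base` value you pass is an assumption for your setup (substitute your Spark's address):

```python
import json
import urllib.request

def parse_model_names(tags_json: dict) -> list:
    # Extract model names from an /api/tags response body
    return [m["name"] for m in tags_json.get("models", [])]

def list_models(api_base: str) -> list:
    # GET /api/tags returns every model the Ollama server has pulled
    with urllib.request.urlopen(f"{api_base}/api/tags") as resp:
        return parse_model_names(json.load(resp))
```

`list_models("http://YOUR_SPARK_IP:11434")` should include `gpt-oss:120b` if Steps 1 and 2 succeeded; any name it returns is valid for the `model:` field in config.yaml.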
## Troubleshooting

## Common Issues

**1. Ollama not starting**
- Verify GPU drivers are installed correctly.
  Run `nvidia-smi` in the terminal. If the command fails, check DGX Dashboard for updates to your DGX Spark.
  If there are no updates, or updates do not correct the issue, create a thread on the DGX Spark/GB10 user forum here:
  https://forums.developer.nvidia.com/c/accelerated-computing/dgx-spark-gb10/dgx-spark-gb10/
- Run `ollama serve` on the DGX Spark to view Ollama logs.

**2. Continue can't connect over the network**
- Ensure port 11434 is open and accessible from your workstation.

```bash
ss -tuln | grep 11434
```

If the output does not show Ollama listening on `*:11434` (e.g., `tcp LISTEN 0 4096 *:11434 *:*`),
go back to Step 2 and re-run the configuration and firewall commands.
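The same reachability check can be run from the workstation side. A minimal Python sketch; the host you pass is an assumption for your setup, and 11434 is Ollama's default port from this guide:

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    # True if a TCP connection to host:port succeeds within the timeout
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `port_open("YOUR_SPARK_IP", 11434)` returns `False`, the likely culprits are the firewall rule or the `OLLAMA_HOST` binding from Step 2.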
**3. Continue can't detect a locally running Ollama model**
- Check `OLLAMA_HOST` and `OLLAMA_ORIGINS` in `/etc/systemd/system/ollama.service.d/override.conf`.
- If `OLLAMA_HOST` and `OLLAMA_ORIGINS` are set correctly and detection still fails, also export those variables in your `.bashrc`.

**4. High memory usage**
- Use smaller models such as `gpt-oss:20b` for lightweight usage.
- Confirm no other large models or containers are running with `nvidia-smi`.
@@ -118,7 +118,7 @@ sudo /usr/local/cuda-12.9/bin/cuda-uninstaller
 ## Step 1. Configure network connectivity
 
-Follow the network setup instructions from the [Connect two Sparks](https://build.nvidia.com/spark/stack-sparks/stacked-sparks) playbook to establish connectivity between your DGX Spark nodes.
+Follow the network setup instructions from the [Connect two Sparks](https://build.nvidia.com/spark/connect-two-sparks) playbook to establish connectivity between your DGX Spark nodes.
 
 This includes:
 - Physical QSFP cable connection

@@ -340,7 +340,7 @@ http://192.168.100.10:8265
 | Container registry authentication fails | Invalid or expired GitLab token | Generate new auth token |
 | SM_121a architecture not recognized | Missing LLVM patches | Verify SM_121a patches applied to LLVM source |
 
-## Common Issues for running on two Starks
+## Common Issues for running on two Sparks
 | Symptom | Cause | Fix |
 |---------|--------|-----|
 | Node 2 not visible in Ray cluster | Network connectivity issue | Verify QSFP cable connection, check IP configuration |

@@ -1,4 +1,4 @@
-# Video Search and Summarization
+# Build a Video Search and Summarization (VSS) Agent
 
 > Run the VSS Blueprint on your Spark
 

@@ -30,8 +30,8 @@ You will deploy NVIDIA's VSS AI Blueprint on NVIDIA Spark hardware with Blackwel
 ## Prerequisites
 
 - NVIDIA Spark device with ARM64 architecture and Blackwell GPU
-- FastOS 1.81.38 or compatible ARM64 system
+- NVIDIA DGX OS 7.2.3 or higher
-- Driver version 580.82.09 or higher installed: `nvidia-smi | grep "Driver Version"`
+- Driver version 580.95.05 or higher installed: `nvidia-smi | grep "Driver Version"`
 - CUDA version 13.0 installed: `nvcc --version`
 - Docker installed and running: `docker --version && docker compose version`
 - Access to NVIDIA Container Registry with [NGC API Key](https://org.ngc.nvidia.com/setup/api-keys)

@@ -278,6 +278,10 @@ Open these URLs in your browser:
 In this hybrid deployment, we would use NIMs from [build.nvidia.com](https://build.nvidia.com/). Alternatively, you can configure your own hosted endpoints by following the instructions in the [VSS remote deployment guide](https://docs.nvidia.com/vss/latest/content/installation-remote-docker-compose.html).
 
+> [!NOTE]
+> Fully local deployment using smaller LLM (Llama 3.1 8B) is also possible.
+> To set up a fully local VSS deployment, follow the [instructions in the VSS documentation](https://docs.nvidia.com/vss/latest/content/vss_dep_docker_compose_arm.html#local-deployment-single-gpu-dgx-spark).
 
 **9.1 Get NVIDIA API Key**
 
 - Log in to https://build.nvidia.com/explore/discover.