Mirror of https://github.com/NVIDIA/dgx-spark-playbooks.git (synced 2026-04-22 01:53:53 +00:00)

Commit 34239a8313 (parent e17deb3167): chore: Regenerate all playbooks
Each playbook includes prerequisites, step-by-step instructions, troubleshooting…

- [Comfy UI](nvidia/comfy-ui/)
- [Set Up Local Network Access](nvidia/connect-to-your-spark/)
- [Connect Two Sparks](nvidia/connect-two-sparks/)
- [CUDA-X Data Science](nvidia/cuda-x-data-science/)
- [DGX Dashboard](nvidia/dgx-dashboard/)
- [FLUX.1 Dreambooth LoRA Fine-tuning](nvidia/flux-finetuning/)
- [Optimized JAX](nvidia/jax/)
- [LLaMA Factory](nvidia/llama-factory/)
- [MONAI Reasoning Model](nvidia/monai-reasoning/)
- [Build and Deploy a Multi-Agent Chatbot](nvidia/multi-agent-chatbot/)
- [Multi-modal Inference](nvidia/multi-modal-inference/)
- [NCCL for Two Sparks](nvidia/nccl/)
- [NVFP4 Quantization](nvidia/nvfp4-quantization/)
- [Ollama](nvidia/ollama/)
- [Open WebUI with Ollama](nvidia/open-webui/)
- [Use Open Fold](nvidia/protein-folding/)
- [Fine tune with PyTorch](nvidia/pytorch-fine-tune/)
- [RAG application in AI Workbench](nvidia/rag-ai-workbench/)
- [SGLang Inference Server](nvidia/sglang/)
- [Speculative Decoding](nvidia/speculative-decoding/)
- [Set up Tailscale on your Spark](nvidia/tailscale/)
- [TRT LLM for Inference](nvidia/trt-llm/)
- [Text to Knowledge Graph](nvidia/txt2kg/)
- [Unsloth on DGX Spark](nvidia/unsloth/)
- [Vibe Coding in VS Code](nvidia/vibe-coding/)
- [Install and Use vLLM for Inference](nvidia/vllm/)
- [Vision-Language Model Fine-tuning](nvidia/vlm-finetuning/)
- [VS Code](nvidia/vscode/)
# CUDA-X Data Science

> Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas, and more with zero code changes

## Table of Contents

- [Overview](#overview)
- [Instructions](#instructions)

---

## Overview

## Basic Idea

This playbook includes two example notebooks that demonstrate the acceleration of key machine learning algorithms and core pandas operations using CUDA-X Data Science libraries:

- **NVIDIA cuDF:** Accelerates data preparation and core data processing of 8 GB of string data, with no code changes.
- **NVIDIA cuML:** Accelerates popular, compute-intensive machine learning algorithms in scikit-learn (LinearSVC), UMAP, and HDBSCAN, with no code changes.

CUDA-X Data Science (formerly RAPIDS) is an open-source library collection that accelerates the data science and data processing ecosystem. These libraries accelerate popular Python tools like scikit-learn and pandas with zero code changes. On DGX Spark, these libraries maximize performance at your desk with your existing code.

## What you'll accomplish

You will accelerate popular machine learning algorithms and data analytics operations on the GPU. You will understand how to accelerate popular Python tools, and the value of running data science workflows on your DGX Spark.

## Prerequisites

- Familiarity with pandas, scikit-learn, and machine learning algorithms such as support vector machines, clustering, and dimensionality reduction.
- Conda installed.
- A Kaggle API key.

## Time & risk

- Duration:
  - 20-30 minutes setup time.
  - 2-3 minutes to run each notebook.

## Instructions

## Step 1. Verify system requirements

- Verify the system has CUDA 13 installed.
- Verify the Python version is greater than 3.10.
- Install conda using [these instructions](https://docs.anaconda.com/miniconda/install/).
- Create a Kaggle API key using [these instructions](https://www.kaggle.com/discussions/general/74235) and place the **kaggle.json** file in the same folder as the notebook.
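The version checks above can be scripted; a minimal sketch (assumes `nvidia-smi` is on the PATH on a DGX Spark, and falls back gracefully elsewhere):

```python
import subprocess
import sys

# The playbook asks for a Python version greater than 3.10
print("Python:", sys.version.split()[0])

# nvidia-smi reports the driver and CUDA version (should show CUDA 13.x)
try:
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    print(result.stdout if result.returncode == 0 else result.stderr)
except FileNotFoundError:
    print("nvidia-smi not found; is the NVIDIA driver installed?")
```
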
## Step 2. Install the Data Science libraries

- Use the following command to install the CUDA-X libraries (this will create a new conda environment):

```bash
conda create -n rapids-test -c rapidsai-nightly -c conda-forge -c nvidia \
    rapids=25.10 python=3.12 'cuda-version=13.0' \
    jupyterlab hdbscan umap-learn
```

## Step 3. Activate the conda environment

- Activate the conda environment:

```bash
conda activate rapids-test
```
## Step 4. Clone the playbook repository

- Clone the GitHub repository and go to the assets folder inside the cuda-x-data-science folder:

```bash
git clone https://github.com/NVIDIA/dgx-spark-playbooks
cd dgx-spark-playbooks/nvidia/cuda-x-data-science/assets
```

- Place the **kaggle.json** file created in Step 1 in the assets folder.
## Step 5. Run the notebooks
|
||||
There are two notebooks in the GitHub repository.
|
||||
One runs an example of a large strings data processing workflow with pandas code on GPU.
|
||||
- Run the cudf_pandas_demo.ipynb notebook
|
||||
```bash
|
||||
jupyter notebook cudf_pandas_demo.ipynb
|
||||
```
|
||||
The other goes over an example of machine learning algorithms including UMAP and HDBSCAN.
|
||||
- Run the cuml_sklearn_demo.ipynb notebook
|
||||
```bash
|
||||
jupyter notebook cuml_sklearn_demo.ipynb
|
||||
```
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "84635d55-68a2-468b-ac09-9029ebdab55f",
|
||||
"metadata": {
|
||||
"id": "84635d55-68a2-468b-ac09-9029ebdab55f"
|
||||
},
|
||||
"source": [
|
||||
"# Accelerating large string data processing with cudf pandas accelerator mode (cudf.pandas)\n",
|
||||
"<a href=\"https://github.com/rapidsai/cudf\">cuDF</a> is a Python GPU DataFrame library (built on the Apache Arrow columnar memory format) for loading, joining, aggregating, filtering, and otherwise manipulating tabular data using a DataFrame style API in the style of pandas.\n",
|
||||
"\n",
|
||||
"cuDF now provides a <a href=\"https://rapids.ai/cudf-pandas/\">pandas accelerator mode</a> (`cudf.pandas`), allowing you to bring accelerated computing to your pandas workflows without requiring any code change.\n",
|
||||
"\n",
|
||||
"This notebook demonstrates how cuDF pandas accelerator mode can help accelerate processing of datasets with large string fields (4 GB+) by simply adding a `%load_ext` command. We introduced this feature as part of our RAPIDS 24.08 release.\n",
"\n",
|
||||
"**Author:** Allison Ding, Mitesh Patel <br>\n",
|
||||
"**Date:** October 3, 2025"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "bb8fe7ab-c055-40e9-897d-c62c72f28a16",
|
||||
"metadata": {
|
||||
"id": "bb8fe7ab-c055-40e9-897d-c62c72f28a16"
|
||||
},
|
||||
"source": [
|
||||
"# ⚠️ Verify your setup\n",
|
||||
"\n",
|
||||
"First, we'll verify that you are running with an NVIDIA GPU."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "a88b8586-cfdd-4d31-9b4d-9be8508f7ba0",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "a88b8586-cfdd-4d31-9b4d-9be8508f7ba0",
|
||||
"outputId": "18525b64-b34b-40e3-ed3a-1ad56ae794b5"
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Fri Oct 3 23:16:52 2025 \n",
|
||||
"+-----------------------------------------------------------------------------------------+\n",
|
||||
"| NVIDIA-SMI 580.82.09 Driver Version: 580.82.09 CUDA Version: 13.0 |\n",
|
||||
"+-----------------------------------------+------------------------+----------------------+\n",
|
||||
"| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |\n",
|
||||
"| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |\n",
|
||||
"| | | MIG M. |\n",
|
||||
"|=========================================+========================+======================|\n",
|
||||
"| 0 NVIDIA GB10 Off | 0000000F:01:00.0 Off | N/A |\n",
|
||||
"| N/A 44C P0 10W / N/A | Not Supported | 0% Default |\n",
|
||||
"| | | N/A |\n",
|
||||
"+-----------------------------------------+------------------------+----------------------+\n",
|
||||
"\n",
|
||||
"+-----------------------------------------------------------------------------------------+\n",
|
||||
"| Processes: |\n",
|
||||
"| GPU GI CI PID Type Process name GPU Memory |\n",
|
||||
"| ID ID Usage |\n",
|
||||
"|=========================================================================================|\n",
|
||||
"| 0 N/A N/A 3405 G /usr/lib/xorg/Xorg 242MiB |\n",
|
||||
"| 0 N/A N/A 3562 G /usr/bin/gnome-shell 53MiB |\n",
|
||||
"| 0 N/A N/A 214921 C .../envs/rapids-25.10/bin/python 196MiB |\n",
|
||||
"+-----------------------------------------------------------------------------------------+\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!nvidia-smi # this should display information about available GPUs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "5cd58071-4371-428b-8a02-9cd66e6cb91f",
|
||||
"metadata": {
|
||||
"id": "5cd58071-4371-428b-8a02-9cd66e6cb91f"
|
||||
},
|
||||
"source": [
|
||||
"# Download the data"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9eb67713-7cf4-415a-bce7-ff4695862faa",
|
||||
"metadata": {
|
||||
"id": "9eb67713-7cf4-415a-bce7-ff4695862faa"
|
||||
},
|
||||
"source": [
|
||||
"## Overview\n",
|
||||
"The data we'll be working with summarizes job postings data that a developer working at a job listing firm might analyze to understand posting trends.\n",
|
||||
"\n",
|
||||
"We'll need to download a curated copy of this [Kaggle dataset](https://www.kaggle.com/datasets/asaniczka/1-3m-linkedin-jobs-and-skills-2024/data?select=job_summary.csv) directly from the Kaggle API. \n",
"\n",
|
||||
"**Data License and Terms** <br>\n",
|
||||
"As this dataset originates from a Kaggle dataset, it's governed by that dataset's license and terms of use, which is the Open Data Commons license. Review it here: https://opendatacommons.org/licenses/by/1-0/index.html. For each dataset a user elects to use, the user is responsible for checking if the dataset license is fit for the intended purpose.\n",
"\n",
|
||||
"**Are there restrictions on how I can use this data?** <br>\n",
"For each dataset a user elects to use, the user is responsible for checking if the dataset license is fit for the intended purpose.\n",
"\n",
|
||||
"## Get the Data\n",
|
||||
"First, [please follow these instructions from Kaggle to download and/or update your Kaggle API token to get access to the dataset](https://www.kaggle.com/discussions/general/74235). \n",
"\n",
|
||||
"Once generated, make sure to have the **kaggle.json** file in the same folder as the notebook\n",
|
||||
"\n",
|
||||
"Next, run the code below, which should take 1-2 minutes:"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "406838c6-267c-423e-82ab-ea13d5fa9c90",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Requirement already satisfied: kaggle in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (1.7.4.5)\n",
|
||||
"Requirement already satisfied: bleach in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (6.2.0)\n",
|
||||
"Requirement already satisfied: certifi>=14.05.14 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (2025.8.3)\n",
|
||||
"Requirement already satisfied: charset-normalizer in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (3.4.3)\n",
|
||||
"Requirement already satisfied: idna in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (3.10)\n",
|
||||
"Requirement already satisfied: protobuf in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (6.32.1)\n",
|
||||
"Requirement already satisfied: python-dateutil>=2.5.3 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (2.9.0.post0)\n",
|
||||
"Requirement already satisfied: python-slugify in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (8.0.4)\n",
|
||||
"Requirement already satisfied: requests in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (2.32.5)\n",
|
||||
"Requirement already satisfied: setuptools>=21.0.0 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (80.9.0)\n",
|
||||
"Requirement already satisfied: six>=1.10 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (1.17.0)\n",
|
||||
"Requirement already satisfied: text-unidecode in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (1.3)\n",
|
||||
"Requirement already satisfied: tqdm in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (4.67.1)\n",
|
||||
"Requirement already satisfied: urllib3>=1.15.1 in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (2.5.0)\n",
|
||||
"Requirement already satisfied: webencodings in /home/nvidia/miniconda3/envs/rapids-25.10/lib/python3.12/site-packages (from kaggle) (0.5.1)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!pip install kaggle\n",
|
||||
"!mkdir -p ~/.kaggle\n",
|
||||
"!cp kaggle.json ~/.kaggle/\n",
|
||||
"!chmod 600 ~/.kaggle/kaggle.json"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"id": "3efacb3c-5f3d-4ff0-b32a-76bbb80b5f74",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "3efacb3c-5f3d-4ff0-b32a-76bbb80b5f74",
|
||||
"outputId": "5fe4a878-cf57-44f9-e40e-ed413035b150"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Download the dataset through the Kaggle API\n",
"!kaggle datasets download -d asaniczka/1-3m-linkedin-jobs-and-skills-2024\n",
"# Unzip the file to access the contents\n",
"!unzip 1-3m-linkedin-jobs-and-skills-2024.zip"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2__ZMVe6LaBJ",
|
||||
"metadata": {
|
||||
"id": "2__ZMVe6LaBJ"
|
||||
},
|
||||
"source": [
|
||||
"# Analysis with cuDF Pandas"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "df47f304-2b30-4380-afd5-0613b63d103d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The magic command `%load_ext cudf.pandas` enables GPU acceleration for pandas data processing in a Jupyter notebook, allowing most pandas operations to automatically execute on NVIDIA GPUs for improved performance. \n",
|
||||
"\n",
|
||||
"With this extension loaded before importing pandas, your code can use standard pandas syntax while gaining the benefits of GPU speedup, automatically falling back to CPU execution for operations not supported on the GPU. This provides a seamless way to accelerate existing pandas workflows with zero code changes, especially for large data analytics tasks or machine learning preprocessing."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "e5cd2520-30a6-41c1-b7c5-5abe0eb90d82",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%load_ext cudf.pandas"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "eadb8d77-cb45-4c7c-ae9f-77e47a4f29b3",
|
||||
"metadata": {
|
||||
"id": "eadb8d77-cb45-4c7c-ae9f-77e47a4f29b3"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import pandas as pd\n",
|
||||
"import numpy as np"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "196268f2-6169-4ed7-a9e6-db9078caa6ab",
|
||||
"metadata": {
|
||||
"id": "196268f2-6169-4ed7-a9e6-db9078caa6ab"
|
||||
},
|
||||
"source": [
|
||||
"We'll run a piece of code to get a feel for what GPU acceleration brings to pandas workflows."
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "2688bfeb-58c4-4fc0-9233-8d7e2759ec46",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import time\n",
|
||||
"start_time = time.time()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "ae3b6a16-ff72-4421-b43c-06c33f57ec12",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "ae3b6a16-ff72-4421-b43c-06c33f57ec12",
|
||||
"outputId": "656acbf7-078f-42b3-832d-ad4e84e01c70"
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"CPU times: user 185 ms, sys: 2.08 s, total: 2.27 s\n",
|
||||
"Wall time: 2.95 s\n",
|
||||
"Dataset Size (in GB): 4.76\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%time job_summary_df = pd.read_csv(\"job_summary.csv\", dtype=('str'))\n",
|
||||
"print(\"Dataset Size (in GB):\",round(job_summary_df.memory_usage(\n",
|
||||
" deep=True).sum()/(1024**3),2))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "01c506e1-f135-4afb-8fc7-23e72c05d73c",
|
||||
"metadata": {
|
||||
"id": "01c506e1-f135-4afb-8fc7-23e72c05d73c"
|
||||
},
|
||||
"source": [
|
||||
"The same dataset takes around 1.5 minutes to load with standard CPU pandas. That's around a **5x speedup** with no changes to the code!"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "d9d0a0e1-1d74-494d-bd12-b829f11eeede",
|
||||
"metadata": {
|
||||
"id": "d9d0a0e1-1d74-494d-bd12-b829f11eeede"
|
||||
},
|
||||
"source": [
|
||||
"Let's load the remaining two datasets as well:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "12e4cf7e-8824-4822-9d30-46b81ba2acd7",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "12e4cf7e-8824-4822-9d30-46b81ba2acd7",
|
||||
"outputId": "5ca1be17-09e3-40ab-928b-82176bf597bf"
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"CPU times: user 45.3 ms, sys: 199 ms, total: 244 ms\n",
|
||||
"Wall time: 354 ms\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%%time\n",
|
||||
"job_skills_df = pd.read_csv(\"job_skills.csv\", dtype=('str'))\n",
|
||||
"job_postings_df = pd.read_csv(\"linkedin_job_postings.csv\", dtype=('str'))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 38,
|
||||
"id": "13c8f9da-121f-4311-8a79-274425363e5e",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/",
|
||||
"height": 276
|
||||
},
|
||||
"id": "13c8f9da-121f-4311-8a79-274425363e5e",
|
||||
"outputId": "a73599c1-05b2-4f56-a190-c69c017bb330"
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"CPU times: user 4.46 ms, sys: 3.1 ms, total: 7.56 ms\n",
|
||||
"Wall time: 46.3 ms\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"0 957\n",
|
||||
"1 3816\n",
|
||||
"2 5314\n",
|
||||
"3 2774\n",
|
||||
"4 2749\n",
|
||||
"Name: summary_length, dtype: int32"
|
||||
]
|
||||
},
|
||||
"execution_count": 38,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%%time\n",
|
||||
"job_summary_df['summary_length'] = job_summary_df['job_summary'].str.len()\n",
|
||||
"job_summary_df['summary_length'].head()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "67b68792-5c64-4ebd-9d80-cf6ff55baeef",
|
||||
"metadata": {
|
||||
"id": "67b68792-5c64-4ebd-9d80-cf6ff55baeef"
|
||||
},
|
||||
"source": [
|
||||
"That was lightning fast! We went from around 10+ seconds (with pandas) to a few milliseconds."
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 39,
|
||||
"id": "31e1cc84-debb-4da7-bc20-5c7139f786f7",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/",
|
||||
"height": 504
|
||||
},
|
||||
"id": "31e1cc84-debb-4da7-bc20-5c7139f786f7",
|
||||
"outputId": "2d89fc49-7e5b-41db-c25b-441d54480711"
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"CPU times: user 39.8 ms, sys: 30 ms, total: 69.8 ms\n",
|
||||
"Wall time: 211 ms\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<div>\n",
|
||||
"<style scoped>\n",
|
||||
" .dataframe tbody tr th:only-of-type {\n",
|
||||
" vertical-align: middle;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" .dataframe tbody tr th {\n",
|
||||
" vertical-align: top;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" .dataframe thead th {\n",
|
||||
" text-align: right;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<table border=\"1\" class=\"dataframe\">\n",
|
||||
" <thead>\n",
|
||||
" <tr style=\"text-align: right;\">\n",
|
||||
" <th></th>\n",
|
||||
" <th>job_link</th>\n",
|
||||
" <th>last_processed_time</th>\n",
|
||||
" <th>got_summary</th>\n",
|
||||
" <th>got_ner</th>\n",
|
||||
" <th>is_being_worked</th>\n",
|
||||
" <th>job_title</th>\n",
|
||||
" <th>company</th>\n",
|
||||
" <th>job_location</th>\n",
|
||||
" <th>first_seen</th>\n",
|
||||
" <th>search_city</th>\n",
|
||||
" <th>search_country</th>\n",
|
||||
" <th>search_position</th>\n",
|
||||
" <th>job_level</th>\n",
|
||||
" <th>job_type</th>\n",
|
||||
" <th>job_summary</th>\n",
|
||||
" <th>summary_length</th>\n",
|
||||
" </tr>\n",
|
||||
" </thead>\n",
|
||||
" <tbody>\n",
|
||||
" <tr>\n",
|
||||
" <th>0</th>\n",
|
||||
" <td>https://www.linkedin.com/jobs/view/account-exe...</td>\n",
|
||||
" <td>2024-01-21 07:12:29.00256+00</td>\n",
|
||||
" <td>t</td>\n",
|
||||
" <td>t</td>\n",
|
||||
" <td>f</td>\n",
|
||||
" <td>Account Executive - Dispensing (NorCal/Norther...</td>\n",
|
||||
" <td>BD</td>\n",
|
||||
" <td>San Diego, CA</td>\n",
|
||||
" <td>2024-01-15</td>\n",
|
||||
" <td>Coronado</td>\n",
|
||||
" <td>United States</td>\n",
|
||||
" <td>Color Maker</td>\n",
|
||||
" <td>Mid senior</td>\n",
|
||||
" <td>Onsite</td>\n",
|
||||
" <td>Responsibilities\\nJob Description Summary\\nJob...</td>\n",
|
||||
" <td>4602</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1</th>\n",
|
||||
" <td>https://www.linkedin.com/jobs/view/registered-...</td>\n",
|
||||
" <td>2024-01-21 07:39:58.88137+00</td>\n",
|
||||
" <td>t</td>\n",
|
||||
" <td>t</td>\n",
|
||||
" <td>f</td>\n",
|
||||
" <td>Registered Nurse - RN Care Manager</td>\n",
|
||||
" <td>Trinity Health MI</td>\n",
|
||||
" <td>Norton Shores, MI</td>\n",
|
||||
" <td>2024-01-14</td>\n",
|
||||
" <td>Grand Haven</td>\n",
|
||||
" <td>United States</td>\n",
|
||||
" <td>Director Nursing Service</td>\n",
|
||||
" <td>Mid senior</td>\n",
|
||||
" <td>Onsite</td>\n",
|
||||
" <td>Employment Type:\\nFull time\\nShift:\\nDescripti...</td>\n",
|
||||
" <td>2950</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>2</th>\n",
|
||||
" <td>https://www.linkedin.com/jobs/view/restaurant-...</td>\n",
|
||||
" <td>2024-01-21 07:40:00.251126+00</td>\n",
|
||||
" <td>t</td>\n",
|
||||
" <td>t</td>\n",
|
||||
" <td>f</td>\n",
|
||||
" <td>RESTAURANT SUPERVISOR - THE FORKLIFT</td>\n",
|
||||
" <td>Wasatch Adaptive Sports</td>\n",
|
||||
" <td>Sandy, UT</td>\n",
|
||||
" <td>2024-01-14</td>\n",
|
||||
" <td>Tooele</td>\n",
|
||||
" <td>United States</td>\n",
|
||||
" <td>Stand-In</td>\n",
|
||||
" <td>Mid senior</td>\n",
|
||||
" <td>Onsite</td>\n",
|
||||
" <td>Job Details\\nDescription\\nWhat You'll Do\\nAs a...</td>\n",
|
||||
" <td>4571</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>3</th>\n",
|
||||
" <td>https://www.linkedin.com/jobs/view/independent...</td>\n",
|
||||
" <td>2024-01-21 07:40:00.308133+00</td>\n",
|
||||
" <td>t</td>\n",
|
||||
" <td>t</td>\n",
|
||||
" <td>f</td>\n",
|
||||
" <td>Independent Real Estate Agent</td>\n",
|
||||
" <td>Howard Hanna | Rand Realty</td>\n",
|
||||
" <td>Englewood Cliffs, NJ</td>\n",
|
||||
" <td>2024-01-16</td>\n",
|
||||
" <td>Pinehurst</td>\n",
|
||||
" <td>United States</td>\n",
|
||||
" <td>Real-Estate Clerk</td>\n",
|
||||
" <td>Mid senior</td>\n",
|
||||
" <td>Onsite</td>\n",
|
||||
" <td>Who We Are\\nRand Realty is a family-owned brok...</td>\n",
|
||||
" <td>3944</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>4</th>\n",
|
||||
" <td>https://www.linkedin.com/jobs/view/group-unit-...</td>\n",
|
||||
" <td>2024-01-19 09:45:09.215838+00</td>\n",
|
||||
" <td>f</td>\n",
|
||||
" <td>f</td>\n",
|
||||
" <td>f</td>\n",
|
||||
" <td>Group/Unit Supervisor (Systems Support Manager...</td>\n",
|
||||
" <td>IRS, Office of Chief Counsel</td>\n",
|
||||
" <td>Chamblee, GA</td>\n",
|
||||
" <td>2024-01-17</td>\n",
|
||||
" <td>Gadsden</td>\n",
|
||||
" <td>United States</td>\n",
|
||||
" <td>Supervisor Travel-Information Center</td>\n",
|
||||
" <td>Mid senior</td>\n",
|
||||
" <td>Onsite</td>\n",
|
||||
" <td>None</td>\n",
|
||||
" <td><NA></td>\n",
|
||||
" </tr>\n",
|
||||
" </tbody>\n",
|
||||
"</table>\n",
|
||||
"</div>"
|
||||
],
|
||||
"text/plain": [
|
||||
" job_link \\\n",
|
||||
"0 https://www.linkedin.com/jobs/view/account-exe... \n",
|
||||
"1 https://www.linkedin.com/jobs/view/registered-... \n",
|
||||
"2 https://www.linkedin.com/jobs/view/restaurant-... \n",
|
||||
"3 https://www.linkedin.com/jobs/view/independent... \n",
|
||||
"4 https://www.linkedin.com/jobs/view/group-unit-... \n",
|
||||
"\n",
|
||||
" last_processed_time got_summary got_ner is_being_worked \\\n",
|
||||
"0 2024-01-21 07:12:29.00256+00 t t f \n",
|
||||
"1 2024-01-21 07:39:58.88137+00 t t f \n",
|
||||
"2 2024-01-21 07:40:00.251126+00 t t f \n",
|
||||
"3 2024-01-21 07:40:00.308133+00 t t f \n",
|
||||
"4 2024-01-19 09:45:09.215838+00 f f f \n",
|
||||
"\n",
|
||||
" job_title \\\n",
|
||||
"0 Account Executive - Dispensing (NorCal/Norther... \n",
|
||||
"1 Registered Nurse - RN Care Manager \n",
|
||||
"2 RESTAURANT SUPERVISOR - THE FORKLIFT \n",
|
||||
"3 Independent Real Estate Agent \n",
|
||||
"4 Group/Unit Supervisor (Systems Support Manager... \n",
|
||||
"\n",
|
||||
" company job_location first_seen \\\n",
|
||||
"0 BD San Diego, CA 2024-01-15 \n",
|
||||
"1 Trinity Health MI Norton Shores, MI 2024-01-14 \n",
|
||||
"2 Wasatch Adaptive Sports Sandy, UT 2024-01-14 \n",
|
||||
"3 Howard Hanna | Rand Realty Englewood Cliffs, NJ 2024-01-16 \n",
|
||||
"4 IRS, Office of Chief Counsel Chamblee, GA 2024-01-17 \n",
|
||||
"\n",
|
||||
" search_city search_country search_position \\\n",
|
||||
"0 Coronado United States Color Maker \n",
|
||||
"1 Grand Haven United States Director Nursing Service \n",
|
||||
"2 Tooele United States Stand-In \n",
|
||||
"3 Pinehurst United States Real-Estate Clerk \n",
|
||||
"4 Gadsden United States Supervisor Travel-Information Center \n",
|
||||
"\n",
|
||||
" job_level job_type job_summary \\\n",
|
||||
"0 Mid senior Onsite Responsibilities\\nJob Description Summary\\nJob... \n",
|
||||
"1 Mid senior Onsite Employment Type:\\nFull time\\nShift:\\nDescripti... \n",
|
||||
"2 Mid senior Onsite Job Details\\nDescription\\nWhat You'll Do\\nAs a... \n",
|
||||
"3 Mid senior Onsite Who We Are\\nRand Realty is a family-owned brok... \n",
|
||||
"4 Mid senior Onsite None \n",
|
||||
"\n",
|
||||
" summary_length \n",
|
||||
"0 4602 \n",
|
||||
"1 2950 \n",
|
||||
"2 4571 \n",
|
||||
"3 3944 \n",
|
||||
"4 <NA> "
|
||||
]
|
||||
},
|
||||
"execution_count": 39,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%%time\n",
|
||||
"df_merged=pd.merge(job_postings_df, job_summary_df, how=\"left\", on=\"job_link\")\n",
|
||||
"df_merged.head()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 40,
|
||||
"id": "0160a559-2b17-40a6-ad9d-34ce746236d0",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/",
|
||||
"height": 490
|
||||
},
|
||||
"id": "0160a559-2b17-40a6-ad9d-34ce746236d0",
|
||||
"outputId": "e397c28b-a90d-42d2-8a9a-4c6260c45b38"
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"CPU times: user 33.2 ms, sys: 17.3 ms, total: 50.6 ms\n",
|
||||
"Wall time: 120 ms\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<div>\n",
|
||||
"<style scoped>\n",
|
||||
" .dataframe tbody tr th:only-of-type {\n",
|
||||
" vertical-align: middle;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" .dataframe tbody tr th {\n",
|
||||
" vertical-align: top;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" .dataframe thead th {\n",
|
||||
" text-align: right;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<table border=\"1\" class=\"dataframe\">\n",
|
||||
" <thead>\n",
|
||||
" <tr style=\"text-align: right;\">\n",
|
||||
" <th></th>\n",
|
||||
" <th></th>\n",
|
||||
" <th>summary_length</th>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>company</th>\n",
|
||||
" <th>job_title</th>\n",
|
||||
" <th></th>\n",
|
||||
" </tr>\n",
|
||||
" </thead>\n",
|
||||
" <tbody>\n",
|
||||
" <tr>\n",
|
||||
" <th>ClickJobs.io</th>\n",
|
||||
" <th>Adolescent Behavioral Health Therapist - Substance Use Specialty (Entry Senior Level) Psychiatry</th>\n",
|
||||
" <td>23748.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>Mt. San Antonio College</th>\n",
|
||||
" <th>Chief, Police and Campus Safety</th>\n",
|
||||
" <td>22998.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>CareerBeacon</th>\n",
|
||||
" <th>Airside/Groundside Project Manager [Halifax International Airport Authority]</th>\n",
|
||||
" <td>22938.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>Tacoma Community College</th>\n",
|
||||
" <th>Anthropology Professor - Part-time</th>\n",
|
||||
" <td>22790.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>IRS, Office of Chief Counsel</th>\n",
|
||||
" <th>Program Analyst (12-Month Roster)</th>\n",
|
||||
" <td>22774.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>...</th>\n",
|
||||
" <th>...</th>\n",
|
||||
" <td>...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th rowspan=\"4\" valign=\"top\">鴻海精密工業股份有限公司</th>\n",
|
||||
" <th>HR Specialist - Payroll & Benefit</th>\n",
|
||||
" <td>0.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>Material Planner</th>\n",
|
||||
" <td>0.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>RFQ Specialist</th>\n",
|
||||
" <td>0.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>Supply Chain Program Manager</th>\n",
|
||||
" <td>0.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>🌟Daniel-Scott Recruitment Ltd🌟</th>\n",
|
||||
" <th>IT Manager</th>\n",
|
||||
" <td>0.0</td>\n",
|
||||
" </tr>\n",
|
||||
" </tbody>\n",
|
||||
"</table>\n",
|
||||
"<p>801276 rows × 1 columns</p>\n",
|
||||
"</div>"
|
||||
],
|
||||
"text/plain": [
|
||||
" summary_length\n",
|
||||
"company job_title \n",
|
||||
"ClickJobs.io Adolescent Behavioral Health Therapist - Substa... 23748.0\n",
|
||||
"Mt. San Antonio College Chief, Police and Campus Safety 22998.0\n",
|
||||
"CareerBeacon Airside/Groundside Project Manager [Halifax Int... 22938.0\n",
|
||||
"Tacoma Community College Anthropology Professor - Part-time 22790.0\n",
|
||||
"IRS, Office of Chief Counsel Program Analyst (12-Month Roster) 22774.0\n",
|
||||
"... ...\n",
|
||||
"鴻海精密工業股份有限公司 HR Specialist - Payroll & Benefit 0.0\n",
|
||||
" Material Planner 0.0\n",
|
||||
" RFQ Specialist 0.0\n",
|
||||
" Supply Chain Program Manager 0.0\n",
|
||||
"🌟Daniel-Scott Recruitment Ltd🌟 IT Manager 0.0\n",
|
||||
"\n",
|
||||
"[801276 rows x 1 columns]"
|
||||
]
|
||||
},
|
||||
"execution_count": 40,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%%time\n",
|
||||
"df_merged.groupby(['company',\"job_title\"]).agg({\n",
|
||||
" \"summary_length\":\"mean\"}).sort_values(by='summary_length', ascending = False).fillna(0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "IME4urGYQ3qS",
|
||||
"metadata": {
|
||||
"id": "IME4urGYQ3qS"
|
||||
},
|
||||
"source": [
|
||||
"We went down from around 5 seconds to less than a second here. This is in line with our speedups on other operations!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 41,
|
||||
"id": "adc00726-f151-41f4-8731-a1ce1f83eea2",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/",
|
||||
"height": 458
|
||||
},
|
||||
"id": "adc00726-f151-41f4-8731-a1ce1f83eea2",
|
||||
"outputId": "46423696-b167-4ffe-bb3b-9de7f3e6d668"
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"CPU times: user 13.7 ms, sys: 20.3 ms, total: 34 ms\n",
|
||||
"Wall time: 156 ms\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<div>\n",
|
||||
"<style scoped>\n",
|
||||
" .dataframe tbody tr th:only-of-type {\n",
|
||||
" vertical-align: middle;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" .dataframe tbody tr th {\n",
|
||||
" vertical-align: top;\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" .dataframe thead th {\n",
|
||||
" text-align: right;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<table border=\"1\" class=\"dataframe\">\n",
|
||||
" <thead>\n",
|
||||
" <tr style=\"text-align: right;\">\n",
|
||||
" <th></th>\n",
|
||||
" <th>job_title</th>\n",
|
||||
" <th>job_location</th>\n",
|
||||
" <th>summary_length</th>\n",
|
||||
" </tr>\n",
|
||||
" </thead>\n",
|
||||
" <tbody>\n",
|
||||
" <tr>\n",
|
||||
" <th>0</th>\n",
|
||||
" <td>🔥Nurse Manager, Patient Services - Operating Room</td>\n",
|
||||
" <td>Lake George, NY</td>\n",
|
||||
" <td>7342.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1</th>\n",
|
||||
" <td>🔥Behavioral Health RN 3 12s</td>\n",
|
||||
" <td>Glens Falls, NY</td>\n",
|
||||
" <td>2787.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>2</th>\n",
|
||||
" <td>🔥 Surgical Technologist - Evenings</td>\n",
|
||||
" <td>Lake George, NY</td>\n",
|
||||
" <td>2920.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>3</th>\n",
|
||||
" <td>🔥 Physician Practice Clinical Lead RN</td>\n",
|
||||
" <td>Saratoga Springs, NY</td>\n",
|
||||
" <td>2945.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>4</th>\n",
|
||||
" <td>🔥 Physican Practice LPN - Green</td>\n",
|
||||
" <td>Lake George, NY</td>\n",
|
||||
" <td>2969.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>...</th>\n",
|
||||
" <td>...</td>\n",
|
||||
" <td>...</td>\n",
|
||||
" <td>...</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1104106</th>\n",
|
||||
" <td>\"Attorney\" (Gov Appt/Non-Merit) Jobs</td>\n",
|
||||
" <td>Kentucky, United States</td>\n",
|
||||
" <td>2427.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1104107</th>\n",
|
||||
" <td>\"Accountant\"</td>\n",
|
||||
" <td>Shavano Park, TX</td>\n",
|
||||
" <td>1497.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1104108</th>\n",
|
||||
" <td>\"Accountant\"</td>\n",
|
||||
" <td>Basking Ridge, NJ</td>\n",
|
||||
" <td>1073.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1104109</th>\n",
|
||||
" <td>\"Accountant\"</td>\n",
|
||||
" <td>Austin, TX</td>\n",
|
||||
" <td>1993.0</td>\n",
|
||||
" </tr>\n",
|
||||
" <tr>\n",
|
||||
" <th>1104110</th>\n",
|
||||
" <td>\"A\" Softball Coach - Central Middle School</td>\n",
|
||||
" <td>East Corinth, ME</td>\n",
|
||||
" <td>718.0</td>\n",
|
||||
" </tr>\n",
|
||||
" </tbody>\n",
|
||||
"</table>\n",
|
||||
"<p>1104111 rows × 3 columns</p>\n",
|
||||
"</div>"
|
||||
],
|
||||
"text/plain": [
|
||||
" job_title \\\n",
|
||||
"0 🔥Nurse Manager, Patient Services - Operating Room \n",
|
||||
"1 🔥Behavioral Health RN 3 12s \n",
|
||||
"2 🔥 Surgical Technologist - Evenings \n",
|
||||
"3 🔥 Physician Practice Clinical Lead RN \n",
|
||||
"4 🔥 Physican Practice LPN - Green \n",
|
||||
"... ... \n",
|
||||
"1104106 \"Attorney\" (Gov Appt/Non-Merit) Jobs \n",
|
||||
"1104107 \"Accountant\" \n",
|
||||
"1104108 \"Accountant\" \n",
|
||||
"1104109 \"Accountant\" \n",
|
||||
"1104110 \"A\" Softball Coach - Central Middle School \n",
|
||||
"\n",
|
||||
" job_location summary_length \n",
|
||||
"0 Lake George, NY 7342.0 \n",
|
||||
"1 Glens Falls, NY 2787.0 \n",
|
||||
"2 Lake George, NY 2920.0 \n",
|
||||
"3 Saratoga Springs, NY 2945.0 \n",
|
||||
"4 Lake George, NY 2969.0 \n",
|
||||
"... ... ... \n",
|
||||
"1104106 Kentucky, United States 2427.0 \n",
|
||||
"1104107 Shavano Park, TX 1497.0 \n",
|
||||
"1104108 Basking Ridge, NJ 1073.0 \n",
|
||||
"1104109 Austin, TX 1993.0 \n",
|
||||
"1104110 East Corinth, ME 718.0 \n",
|
||||
"\n",
|
||||
"[1104111 rows x 3 columns]"
|
||||
]
|
||||
},
|
||||
"execution_count": 41,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%%time\n",
|
||||
"# Group by job_title and job_location, and calculate the mean of summary_length\n",
|
||||
"grouped_df = df_merged.groupby(['job_title', 'job_location']).agg({'summary_length': 'mean'})\n",
|
||||
"\n",
|
||||
"# Reset index so the grouped columns become regular columns\n",
|
||||
"grouped_df = grouped_df.reset_index()\n",
|
||||
"\n",
|
||||
"# Sort by job_title, job_location, and summary_length\n",
|
||||
"sorted_df = grouped_df.sort_values(by=['job_title', 'job_location','summary_length'],\n",
|
||||
" ascending=False).reset_index(drop=True).fillna(0)\n",
|
||||
"sorted_df"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "08c97b81-64c5-48fb-8fe0-d36789cf3deb",
|
||||
"metadata": {
|
||||
"id": "08c97b81-64c5-48fb-8fe0-d36789cf3deb"
|
||||
},
|
||||
"source": [
|
||||
"The acceleration is consistently 10x+ for complex aggregations and sorting that involve multiple columns."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "4560fe8f-61f9-4c23-bf43-ed6a82e5456e",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"5.182934522628784\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"end_time = time.time()\n",
|
||||
"execution_time = end_time - start_time\n",
|
||||
"print(execution_time)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9bcc719b-666a-4bc9-97d6-16f448b5c707",
|
||||
"metadata": {
|
||||
"id": "9bcc719b-666a-4bc9-97d6-16f448b5c707"
|
||||
},
|
||||
"source": [
|
||||
"# Summary\n",
|
||||
"\n",
|
||||
"With cudf.pandas, you can keep using pandas as your primary dataframe library. When things start to get a little slow, just load the `cudf.pandas` extension and enjoy the incredible speedups.\n",
|
||||
"\n",
|
||||
"To learn more about cudf.pandas, we encourage you to visit https://rapids.ai/cudf-pandas."
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"accelerator": "GPU",
|
||||
"colab": {
|
||||
"gpuType": "T4",
|
||||
"provenance": []
|
||||
},
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.12.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
File diff suppressed because one or more lines are too long
@ -1,305 +0,0 @@
|
||||
# MONAI Reasoning Model
|
||||
|
||||
> Work with a MONAI-Reasoning-CXR-3B vision-language model through Open WebUI
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Instructions](#instructions)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
## Basic idea
|
||||
|
||||
The MONAI Reasoning CXR 3B model is a **medical AI model** designed for **chest X-ray (CXR) interpretation** with reasoning capabilities. It combines imaging analysis with large-scale language modeling:
|
||||
|
||||
- **Medical focus**: Built within the MONAI framework for healthcare imaging tasks.
|
||||
- **Vision + language**: Takes CXR images as input and produces diagnostic text or reasoning outputs.
|
||||
- **Reasoning layer**: Goes beyond simple classification to explain intermediate steps (e.g., opacity → pneumonia suspicion).
|
||||
- **3B scale**: A moderately large multimodal model (~3 billion parameters).
|
||||
- **Trust and explainability**: Aims to make results more interpretable and clinically useful.
|
||||
|
||||
## What you'll accomplish
|
||||
|
||||
You'll deploy the MONAI-Reasoning-CXR-3B model, a specialized vision-language model for chest X-ray
|
||||
analysis, on an NVIDIA Spark device with Blackwell GPU architecture. By the end of this
|
||||
walkthrough, you will have a complete system running with VLLM serving the model for
|
||||
high-performance inference and Open WebUI providing an easy-to-use interface for interacting
|
||||
with the model. This setup is ideal for clinical demonstrations and research that requires
|
||||
transparent AI reasoning.
|
||||
|
||||
## What to know before starting
|
||||
|
||||
* Experience with the Linux command line and shell scripting
|
||||
* A basic understanding of Docker, including running containers and managing images
|
||||
* Familiarity with Python and using pip for package management
|
||||
* Knowledge of Large Language Models (LLMs) and how to interact with API endpoints
|
||||
* Basic understanding of NVIDIA GPU hardware and CUDA drivers
|
||||
|
||||
## Prerequisites
|
||||
|
||||
**Hardware Requirements:**
|
||||
* NVIDIA Spark device with ARM64 (AArch64) architecture
|
||||
* NVIDIA Blackwell GPU architecture
|
||||
* At least 24GB of GPU VRAM
|
||||
|
||||
**Software Requirements:**
|
||||
|
||||
* **NVIDIA Driver**: Ensure the driver is installed and the GPU is recognized
|
||||
```bash
|
||||
nvidia-smi
|
||||
```
|
||||
|
||||
* **Docker Engine**: Docker must be installed and the daemon running
|
||||
```bash
|
||||
docker --version
|
||||
```
|
||||
|
||||
* **NVIDIA Container Toolkit**: Required for GPU access in containers
|
||||
```bash
|
||||
docker run --rm --gpus all nvidia/cuda:12.9.0-base-ubuntu22.04 nvidia-smi
|
||||
```
|
||||
|
||||
* **Hugging Face CLI**: You'll need this to download the model
|
||||
```bash
|
||||
pip install -U huggingface_hub
|
||||
huggingface-cli whoami
|
||||
```
|
||||
|
||||
* **System Architecture**: Verify your system architecture for proper container selection
|
||||
```bash
|
||||
uname -m
|
||||
## Should output: aarch64 for ARM64 systems like NVIDIA Spark
|
||||
```
|
||||
|
||||
## Time & risk
|
||||
|
||||
* **Estimated time:** 20-35 minutes (not including model download)
|
||||
* **Risk level:** Low. All steps use publicly available containers and models
|
||||
* **Rollback:** The entire deployment is containerized. To roll back, you can simply stop and remove the Docker containers
|
||||
|
||||
## Instructions
|
||||
|
||||
## Step 1. Create the Project Directory
|
||||
|
||||
First, create a dedicated directory to store your model weights and configuration files. This
|
||||
keeps the project organized and provides a clean workspace.
|
||||
|
||||
```bash
|
||||
## Create the main directory
|
||||
mkdir -p ~/monai-reasoning-spark
|
||||
cd ~/monai-reasoning-spark
|
||||
|
||||
## Create a subdirectory for the model
|
||||
mkdir -p models
|
||||
```
|
||||
|
||||
## Step 2. Download the MONAI-Reasoning-CXR-3B Model
|
||||
|
||||
Use the Hugging Face CLI to download the model weights into the directory you just created.
|
||||
The model is approximately 6GB and will take several minutes to download depending on your
|
||||
internet connection.
|
||||
|
||||
```bash
|
||||
huggingface-cli download monai/monai-reasoning-cxr-3b \
|
||||
--local-dir ./models/monai-reasoning-cxr-3b \
|
||||
--local-dir-use-symlinks False
|
||||
```
|
||||
|
||||
**Verification Step:**
|
||||
```bash
|
||||
ls -la ./models/monai-reasoning-cxr-3b
|
||||
## You should see model files including config.json and model weights
|
||||
```
|
||||
|
||||
> [!IMPORTANT]
|
||||
> Currently, a custom internal VLLM container is required until sm121 support is available in the public image. The instructions below use the internal container `******:5005/dl/dgx/vllm:main-py3.31165712-devel`.
|
||||
|
||||
## Step 3. Verify System Architecture
|
||||
|
||||
Before proceeding, confirm your system architecture is ARM64 for proper container selection
|
||||
on your NVIDIA Spark device:
|
||||
|
||||
```bash
|
||||
## Check your system architecture
|
||||
uname -m
|
||||
## Should output: aarch64 for ARM64 systems like NVIDIA Spark
|
||||
```
|
||||
|
||||
## Step 4. Create a Docker Network
|
||||
|
||||
Create a dedicated Docker bridge network to allow the VLLM and Open WebUI containers to
|
||||
communicate with each other easily and reliably.
|
||||
|
||||
```bash
|
||||
docker network create monai-net
|
||||
```
|
||||
|
||||
## Step 5. Deploy the VLLM Server
|
||||
|
||||
Launch the VLLM container with ARM64 architecture support, attaching it to the network you
|
||||
created and mounting your local model directory. This step configures the server for optimal
|
||||
performance on NVIDIA Spark hardware.
|
||||
|
||||
```bash
|
||||
## Stop and remove existing container if running
|
||||
docker stop vllm-server 2>/dev/null || true
|
||||
docker rm vllm-server 2>/dev/null || true
|
||||
|
||||
## Run the VLLM server with internal container
|
||||
docker run --rm -d \
|
||||
--name vllm-server \
|
||||
--gpus all \
|
||||
--ipc=host \
|
||||
--ulimit memlock=-1 \
|
||||
--ulimit stack=67108864 \
|
||||
--network monai-net \
|
||||
--platform linux/arm64 \
|
||||
-v ./models/monai-reasoning-cxr-3b:/model \
|
||||
-p 8000:8000 \
|
||||
******:5005/dl/dgx/vllm:main-py3.31165712-devel \
|
||||
vllm serve /model \
|
||||
--host 0.0.0.0 \
|
||||
--port 8000 \
|
||||
--dtype bfloat16 \
|
||||
--trust-remote-code \
|
||||
--gpu-memory-utilization 0.5 \
|
||||
--enforce-eager \
|
||||
--served-model-name monai-reasoning-cxr-3b
|
||||
```
|
||||
|
||||
**Wait for startup and verify:**
|
||||
```bash
|
||||
## Wait for the model to load (can take 1-2 minutes on Spark hardware)
|
||||
sleep 90
|
||||
|
||||
## Check if container is running
|
||||
docker ps
|
||||
|
||||
## Test the VLLM API
|
||||
curl http://localhost:8000/v1/models
|
||||
```
|
||||
|
||||
You should see JSON output showing the model is loaded and available.
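Once the endpoint responds, you can also exercise it programmatically. Below is a minimal sketch (not part of the playbook) that builds an OpenAI-style chat request with an inline base64 image for the `monai-reasoning-cxr-3b` model served above; the multimodal message schema shown is the standard OpenAI vision format, which vLLM's OpenAI-compatible server accepts.

```python
import base64

def build_cxr_request(image_path, prompt,
                      model="monai-reasoning-cxr-3b"):
    """Build an OpenAI-style chat payload with an inline base64 image."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
        "max_tokens": 512,
    }

# To send it against the server from Step 5:
#   import json, urllib.request
#   payload = build_cxr_request("cxr.png", "Find abnormalities and support devices in the image.")
#   req = urllib.request.Request(
#       "http://localhost:8000/v1/chat/completions",
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"})
#   print(json.load(urllib.request.urlopen(req)))
```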
|
||||
|
||||
## Step 6. Deploy Open WebUI
|
||||
|
||||
Launch the Open WebUI container with ARM64 architecture support for your NVIDIA Spark device.
|
||||
|
||||
```bash
|
||||
## Define custom prompt suggestions for medical X-ray analysis
|
||||
PROMPT_SUGGESTIONS='[
|
||||
{
|
||||
"title": ["Analyze X-Ray Image", "Find abnormalities and support devices"],
|
||||
"content": "Find abnormalities and support devices in the image."
|
||||
}
|
||||
]'
|
||||
|
||||
## Stop and remove existing container if running
|
||||
docker stop open-webui 2>/dev/null || true
|
||||
docker rm open-webui 2>/dev/null || true
|
||||
sleep 5
|
||||
|
||||
## Run Open WebUI with custom configuration
|
||||
docker run -d --rm \
|
||||
--name open-webui \
|
||||
--network monai-net \
|
||||
--platform linux/arm64 \
|
||||
-p 3000:8080 \
|
||||
-e WEBUI_AUTH=0 \
|
||||
-e WEBUI_NAME=monai-reasoning \
|
||||
-e ENABLE_SIGNUP=0 \
|
||||
-e ENABLE_ADMIN_CHAT_ACCESS=0 \
|
||||
-e ENABLE_VERSION_UPDATE_CHECK=0 \
|
||||
-e OPENAI_API_BASE_URL="http://vllm-server:8000/v1" \
|
||||
-e DEFAULT_PROMPT_SUGGESTIONS="$PROMPT_SUGGESTIONS" \
|
||||
ghcr.io/open-webui/open-webui:main
|
||||
```
|
||||
|
||||
**Verify deployment:**
|
||||
```bash
|
||||
## Wait for startup
|
||||
sleep 15
|
||||
|
||||
## Check both containers are running
|
||||
docker ps
|
||||
|
||||
## Test Open WebUI accessibility
|
||||
curl -f http://localhost:3000 || echo "Still starting up"
|
||||
```
|
||||
|
||||
## Step 7. Validate the Complete Deployment
|
||||
|
||||
Check that both containers are running properly and all endpoints are accessible:
|
||||
|
||||
```bash
|
||||
## Check container status
|
||||
docker ps
|
||||
## You should see both vllm-server and open-webui containers running
|
||||
|
||||
## Test the VLLM API
|
||||
curl http://localhost:8000/v1/models
|
||||
## Should return JSON with model information
|
||||
|
||||
## Test Open WebUI accessibility
|
||||
curl -f http://localhost:3000
|
||||
## Should return HTTP 200 response
|
||||
```
|
||||
|
||||
## Step 8. Configure Open WebUI
|
||||
|
||||
Configure the front-end interface to connect to your VLLM backend:
|
||||
|
||||
1. Open your web browser and navigate to **http://<YOUR_SPARK_DEVICE_IP>:3000**
|
||||
2. Since authentication is disabled, you'll have direct access to the interface
|
||||
3. The OpenAI API connection is pre-configured through environment variables
|
||||
4. Go to the main chat screen, click **"Select a model"**, and choose **monai-reasoning-cxr-3b**
|
||||
5. **Important:** Navigate to **Chat Controls** → **Advanced Params** and disable **"Reasoning Tags"** to get the full reasoning output from the model
|
||||
|
||||
You can now upload a chest X-ray image and ask questions directly in the chat interface. The custom prompt suggestion "Find abnormalities and support devices in the image" will be available for quick access.
|
||||
|
||||
## Step 9. Cleanup and Rollback
|
||||
|
||||
To stop and remove the containers and network, run the following commands. This will not
|
||||
delete your downloaded model weights.
|
||||
|
||||
> [!WARNING]
|
||||
> This will stop all running containers and remove the network.
|
||||
|
||||
```bash
|
||||
## Stop containers
|
||||
docker stop vllm-server open-webui
|
||||
|
||||
## Remove network
|
||||
docker network rm monai-net
|
||||
|
||||
## Optional: Remove model directory to free disk space
|
||||
## rm -rf ~/monai-reasoning-spark/models
|
||||
```
|
||||
|
||||
## Step 10. Next Steps
|
||||
|
||||
Your MONAI reasoning system is now ready for use. Upload chest X-ray images through the web
|
||||
interface at http://<YOUR_SPARK_DEVICE_IP>:3000 and interact with the MONAI-Reasoning-CXR-3B model
|
||||
for medical image analysis and reasoning tasks.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Symptom | Cause | Fix |
|
||||
|---------|-------|-----|
|
||||
| VLLM container fails to start | Insufficient GPU memory | Reduce `--gpu-memory-utilization` to 0.25 |
|
||||
| Model download fails | Network connectivity or HF auth | Verify `huggingface-cli whoami` succeeds and check your internet connection |
|
||||
| Cannot access gated repo for URL | Some Hugging Face models have restricted access | Regenerate your Hugging Face token and request access to the gated model in your web browser |
|
||||
| Open WebUI shows connection error | Wrong backend URL | Verify `OPENAI_API_BASE_URL` is set correctly |
|
||||
| Model doesn't show full reasoning | Reasoning tags enabled | Disable "Reasoning Tags" in Chat Controls → Advanced Params |
|
||||
|
||||
> [!NOTE]
|
||||
> DGX Spark uses a Unified Memory Architecture (UMA), which enables dynamic memory sharing between the GPU and CPU.
> With many applications still updating to take advantage of UMA, you may encounter memory issues even when within
> the memory capacity of DGX Spark. If that happens, manually flush the buffer cache with:
|
||||
```bash
|
||||
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
|
||||
```
|
||||
@ -1,415 +0,0 @@
|
||||
# Use Open Fold
|
||||
|
||||
> Use OpenFold with TensorRT optimization
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Overview](#overview)
|
||||
- [Access through terminal](#access-through-terminal)
|
||||
- [Step 7. Option B - Run locally with demo script](#step-7-option-b-run-locally-with-demo-script)
|
||||
- [Using a custom FASTA file](#using-a-custom-fasta-file)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
## What you'll accomplish
|
||||
|
||||
You'll set up a GPU-accelerated protein folding workflow on NVIDIA Spark devices using OpenFold
|
||||
with TensorRT optimization and MMseqs2-GPU. After completing this walkthrough, you'll be able to
|
||||
fold proteins in under 60 seconds using either NVIDIA's cloud UI or a local run on your
|
||||
RTX Pro 6000 or DGX Spark workstation.
|
||||
|
||||
## What to know before starting
|
||||
|
||||
- Installing Python packages via pip
|
||||
- Using Docker and the NVIDIA Container Toolkit for GPU workflows
|
||||
- Running basic Linux commands and setting environment variables
|
||||
- Understanding FASTA files and basics of protein structure workflows
|
||||
- Working with CUDA-enabled applications
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- NVIDIA GPU (RTX Pro 6000 or DGX Spark recommended)
|
||||
```bash
|
||||
nvidia-smi # Should show GPU with CUDA ≥12.9
|
||||
```
|
||||
- NVIDIA drivers and CUDA toolkit installed
|
||||
```bash
|
||||
nvcc --version # Should show CUDA 12.9 or higher
|
||||
```
|
||||
- Docker with NVIDIA Container Toolkit
|
||||
```bash
|
||||
docker run --rm --gpus all nvidia/cuda:12.9.0-base-ubuntu22.04 nvidia-smi
|
||||
```
|
||||
- Python 3.8+ environment
|
||||
```bash
|
||||
python3 --version # Should show 3.8 or higher
|
||||
```
|
||||
- Sufficient disk space for databases (>3TB recommended)
|
||||
```bash
|
||||
df -h # Check available space
|
||||
```
|
||||
|
||||
## Ancillary files
|
||||
|
||||
- OpenFold parameters (`finetuning_ptm_2.pt`) — pre-trained model weights for structure prediction
|
||||
- PDB70 database — template structures for homology modeling
|
||||
- UniRef90 database — sequence database for MSA generation
|
||||
- MGnify database — metagenomic sequences for MSA generation
|
||||
- Uniclust30 database — clustered UniProt sequences for MSA generation
|
||||
- BFD database — large sequence database for MSA generation
|
||||
- MMCIF files — template structure files in mmCIF format
|
||||
- py3Dmol package — Python library for 3D protein visualization
|
||||
|
||||
## Time & risk
|
||||
|
||||
**Duration:** Initial setup takes 2-4 hours (mainly downloading databases). Each protein fold takes
|
||||
<60 seconds on GPU vs hours on CPU.
|
||||
|
||||
**Risks:**
|
||||
- Database downloads may fail due to network interruptions
|
||||
- Insufficient disk space for full databases
|
||||
- GPU memory limitations for very large proteins (>2000 residues)
|
||||
|
||||
**Rollback:** All operations are read-only after setup. Remove downloaded databases and output
|
||||
directories to clean up.
|
||||
|
||||
## Access through terminal
|
||||
|
||||
## Step 1. Verify GPU and CUDA installation
|
||||
|
||||
Confirm your system has the required GPU and CUDA version for running OpenFold with TensorRT
|
||||
optimization.
|
||||
|
||||
```bash
|
||||
nvidia-smi
|
||||
```
|
||||
|
||||
Expected output should show an NVIDIA GPU with CUDA capability ≥12.9. For DGX Spark or RTX Pro
|
||||
6000, you should see the appropriate GPU model listed.
|
||||
|
||||
```bash
|
||||
nvcc --version
|
||||
```
|
||||
|
||||
This should display CUDA compilation tools, release 12.9 or higher.
|
||||
|
||||
## Step 2. Set up Python environment
|
||||
|
||||
Create a Python virtual environment and install the required packages for protein folding and
|
||||
visualization.
|
||||
|
||||
```bash
|
||||
python3 -m venv openfold_env
|
||||
source openfold_env/bin/activate
|
||||
pip install --upgrade pip
|
||||
```
|
||||
|
||||
Install the py3Dmol visualization package:
|
||||
|
||||
```bash
|
||||
pip install py3Dmol
|
||||
```
|
||||
|
||||
## Step 3. Download OpenFold and databases
|
||||
|
||||
Download the OpenFold repository and required databases. This step requires significant disk
|
||||
space and network bandwidth.
|
||||
|
||||
> TODO: Add specific download URLs for OpenFold repository from official GitHub
|
||||
|
||||
```bash
|
||||
## Clone OpenFold repository
|
||||
git clone https://github.com/NVIDIA/dgx-spark-playbooks
|
||||
cd dgx-spark-playbooks/nvidia/protein-folding/assets
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
Download the model parameters:
|
||||
|
||||
> TODO: Add direct download URL for finetuning_ptm_2.pt
|
||||
|
||||
```bash
|
||||
mkdir -p openfold_params
|
||||
wget -O openfold_params/finetuning_ptm_2.pt <PARAM_DOWNLOAD_URL>
|
||||
```
|
||||
|
||||
## Step 4. Download sequence databases
|
||||
|
||||
Download all required databases for MSA generation. Each database serves a specific purpose in
|
||||
the folding pipeline.
|
||||
|
||||
> TODO: Add specific download URLs for each database from official sources
|
||||
|
||||
```bash
|
||||
## Create database directory
|
||||
mkdir -p databases
|
||||
cd databases
|
||||
|
||||
## Download PDB70 (for template structures)
|
||||
wget <PDB70_DOWNLOAD_URL>
|
||||
tar -xzf pdb70.tar.gz
|
||||
|
||||
## Download UniRef90 (for MSA)
|
||||
wget <UNIREF90_DOWNLOAD_URL>
|
||||
tar -xzf uniref90.tar.gz
|
||||
|
||||
## Download MGnify (metagenomic sequences)
|
||||
wget <MGNIFY_DOWNLOAD_URL>
|
||||
tar -xzf mgnify.tar.gz
|
||||
|
||||
## Download Uniclust30 (clustered sequences)
|
||||
wget <UNICLUST30_DOWNLOAD_URL>
|
||||
tar -xzf uniclust30.tar.gz
|
||||
|
||||
## Download BFD (large sequence database)
|
||||
wget <BFD_DOWNLOAD_URL>
|
||||
tar -xzf bfd.tar.gz
|
||||
|
||||
## Download MMCIF files (structure templates)
|
||||
wget <MMCIF_DOWNLOAD_URL>
|
||||
tar -xzf mmcif.tar.gz
|
||||
|
||||
cd ..
|
||||
```
|
||||
|
||||
## Step 5. Configure environment variables
|
||||
|
||||
Set up environment variables pointing to your downloaded databases and parameters.
|
||||
|
||||
```bash
|
||||
export OF_PARAM_PATH="$(pwd)/openfold_params/finetuning_ptm_2.pt"
|
||||
export OF_DB_PDB70="$(pwd)/databases/pdb70"
|
||||
export OF_DB_UNIREF90="$(pwd)/databases/uniref90"
|
||||
export OF_DB_MGNIFY="$(pwd)/databases/mgnify"
|
||||
export OF_DB_UNICLUST30="$(pwd)/databases/uniclust30"
|
||||
export OF_DB_BFD="$(pwd)/databases/bfd"
|
||||
export OF_DB_MMCIF="$(pwd)/databases/pdb_mmcif/mmcif_files"
|
||||
export OF_DB_OBSOLETE="$(pwd)/databases/pdb_mmcif/obsolete.dat"
|
||||
export OF_DEVICE="cuda:0"
|
||||
export OF_OUTDIR="openfold_out"
|
||||
export OF_JOB="demo"
|
||||
```
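Before kicking off a long fold, it can save time to confirm that every path above actually exists. Here is a small helper sketch (my own, not part of OpenFold) that checks the `OF_*` variables set in this step:

```python
import os

# Environment variables defined in Step 5 that must point at real files/dirs.
REQUIRED_VARS = [
    "OF_PARAM_PATH", "OF_DB_PDB70", "OF_DB_UNIREF90", "OF_DB_MGNIFY",
    "OF_DB_UNICLUST30", "OF_DB_BFD", "OF_DB_MMCIF", "OF_DB_OBSOLETE",
]

def check_paths(env=os.environ):
    """Return a list of (var, problem) tuples; an empty list means all good."""
    problems = []
    for var in REQUIRED_VARS:
        path = env.get(var)
        if not path:
            problems.append((var, "not set"))
        elif not os.path.exists(path):
            problems.append((var, f"missing: {path}"))
    return problems

if __name__ == "__main__":
    for var, why in check_paths():
        print(f"{var}: {why}")
```

Run it after exporting the variables; no output means every path resolved.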
|
||||
|
||||
## Step 6. Option A - Use NVIDIA Build Portal (Cloud UI)
|
||||
|
||||
For quick testing without local setup, use NVIDIA's online demo interface.
|
||||
|
||||
1. Navigate to the OpenFold2 page on NVIDIA Build Portal
|
||||
> TODO: Add specific URL for NVIDIA Build Portal OpenFold2 demo
|
||||
|
||||
2. Paste your protein sequence in FASTA format
|
||||
|
||||
3. Click "Run" to execute the folding pipeline
|
||||
|
||||
4. View results in the integrated Mol* or py3Dmol viewer
|
||||
|
||||
### Step 7. Option B - Run locally with demo script
|
||||
|
||||
Create and run the OpenFold demo script for local execution on your DGX Spark or RTX Pro 6000.
|
||||
|
||||
Create the demo script file:
|
||||
|
||||
```bash
|
||||
cat > openfold_demo.py << 'EOF'
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Single-file OpenFold runner + py3Dmol viewer.
|
||||
"""
|
||||
import os, subprocess as sp, sys, tempfile, textwrap
|
||||
|
||||
## Paths (edit for your system)
|
||||
PARAM = os.getenv("OF_PARAM_PATH", "/path/to/openfold_params/finetuning_ptm_2.pt")
|
||||
PDB70 = os.getenv("OF_DB_PDB70", "/path/to/pdb70")
|
||||
UNIREF90 = os.getenv("OF_DB_UNIREF90", "/path/to/uniref90")
|
||||
MGNIFY = os.getenv("OF_DB_MGNIFY", "/path/to/mgnify")
|
||||
UNICLUST30 = os.getenv("OF_DB_UNICLUST30", "/path/to/uniclust30")
|
||||
BFD = os.getenv("OF_DB_BFD", "/path/to/bfd")
|
||||
MMCIF = os.getenv("OF_DB_MMCIF", "/path/to/pdb_mmcif/mmcif_files")
|
||||
OBSOLETE = os.getenv("OF_DB_OBSOLETE", "/path/to/pdb_mmcif/obsolete.dat")
|
||||
DEVICE = os.getenv("OF_DEVICE", "cuda:0")
|
||||
OUTDIR = os.getenv("OF_OUTDIR", "openfold_out")
|
||||
JOB = os.getenv("OF_JOB", "demo")
|
||||
|
||||
SEQ = """>demo
|
||||
MGSDKIHHHHHHENLYFQGAMASMTGGQQMGRGSMAAAAKKVVAGAAAAGGQAGD"""
|
||||
|
||||
def ensure_py3dmol():
|
||||
try:
|
||||
import py3Dmol
|
||||
except ImportError:
|
||||
sp.check_call([sys.executable, "-m", "pip", "install", "py3Dmol"])
|
||||
|
||||
def run_openfold(fasta_path):
|
||||
cmd = [
|
||||
sys.executable, "openfold/run_pretrained_openfold.py",
|
||||
"--fasta_path", fasta_path,
|
||||
"--job_name", JOB,
|
||||
"--output_dir", OUTDIR,
|
||||
"--model_device", DEVICE,
|
||||
"--param_path", PARAM,
|
||||
"--pdb70_database_path", PDB70,
|
||||
"--uniref90_database_path", UNIREF90,
|
||||
"--mgnify_database_path", MGNIFY,
|
||||
"--uniclust30_database_path", UNICLUST30,
|
||||
"--bfd_database_path", BFD,
|
||||
"--template_mmcif_dir", MMCIF,
|
||||
"--obsolete_pdbs_path", OBSOLETE,
|
||||
"--skip_relaxation"
|
||||
]
|
||||
sp.check_call(cmd)
|
||||
|
||||
def visualize():
|
||||
import py3Dmol
|
||||
pdb = open(f"{OUTDIR}/{JOB}/ranked_0.pdb").read()
|
||||
view = py3Dmol.view(width=800, height=520)
|
||||
view.addModel(pdb, "pdb")
|
||||
view.setStyle({"cartoon": {"arrows": True}})
|
||||
view.zoomTo()
|
||||
open(f"{OUTDIR}/{JOB}_view.html", "w").write(view._make_html())
|
||||
print(f"Viewer written to {OUTDIR}/{JOB}_view.html")
|
||||
|
||||
def main():
|
||||
ensure_py3dmol()
|
||||
with tempfile.TemporaryDirectory() as td:
|
||||
fasta_path = os.path.join(td, f"{JOB}.fasta")
|
||||
open(fasta_path, "w").write(textwrap.dedent(SEQ).strip() + "\n")
|
||||
run_openfold(fasta_path)
|
||||
visualize()
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
EOF
|
||||
```
|
||||
|
||||
Make the script executable and run it:
|
||||
|
||||
```bash
|
||||
chmod +x openfold_demo.py
|
||||
python openfold_demo.py
|
||||
```
|
||||
|
||||
## Step 8. Validate the output
|
||||
|
||||
Check that the folding completed successfully and view the generated structure.
|
||||
|
||||
```bash
|
||||
## Verify PDB file was created
|
||||
ls -la openfold_out/demo/ranked_0.pdb
|
||||
```
|
||||
|
||||
The file should exist and be non-empty (typically >10KB for a small protein).

```bash
## Check the HTML viewer was generated
ls -la openfold_out/demo_view.html
```

Open the HTML file in a web browser to visualize the folded protein structure:

```bash
## On Linux with GUI
xdg-open openfold_out/demo_view.html

## Or copy the full path and open in browser manually
realpath openfold_out/demo_view.html
```

## Step 9. Run with custom sequences

To fold your own protein sequences, modify the demo script or create a new FASTA file.

### Using a custom FASTA file

```bash
## Create your FASTA file
cat > my_protein.fasta << 'EOF'
>my_protein
MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLPARTVETRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKLRKLNPPDESGPGCMNCKCVIS
EOF

## Run OpenFold directly
python openfold/run_pretrained_openfold.py \
    --fasta_path my_protein.fasta \
    --job_name my_protein \
    --output_dir openfold_out \
    --model_device cuda:0 \
    --param_path $OF_PARAM_PATH \
    --pdb70_database_path $OF_DB_PDB70 \
    --uniref90_database_path $OF_DB_UNIREF90 \
    --mgnify_database_path $OF_DB_MGNIFY \
    --uniclust30_database_path $OF_DB_UNICLUST30 \
    --bfd_database_path $OF_DB_BFD \
    --template_mmcif_dir $OF_DB_MMCIF \
    --obsolete_pdbs_path $OF_DB_OBSOLETE \
    --skip_relaxation
```

## Step 10. Troubleshooting common issues

| Symptom | Cause | Fix |
|---------|-------|-----|
| CUDA out of memory error | Protein too large for GPU | Reduce max_template_date or use smaller sequence |
| Database file not found | Incomplete download or wrong path | Verify all databases downloaded and paths in env vars |
| ImportError: No module named 'openfold' | OpenFold not installed | Run `pip install -e .` in openfold directory |
| nvidia-smi command not found | NVIDIA drivers not installed | Install NVIDIA drivers for your GPU |
| Folding takes hours instead of minutes | Running on CPU instead of GPU | Check OF_DEVICE="cuda:0" and GPU availability |
| py3Dmol viewer shows blank page | JavaScript blocked or path issue | Use absolute path to HTML file or check browser console |

## Step 11. Cleanup and rollback

Remove generated outputs and optionally remove downloaded databases.

```bash
## Remove output files only (safe)
rm -rf openfold_out/

## Remove virtual environment (reversible)
deactivate
rm -rf openfold_env/
```

> [!WARNING]
> The following will delete downloaded databases (>3TB). Only run if you need to free disk space and are willing to re-download.

```bash
## Remove all databases (requires re-download)
rm -rf databases/

## Remove OpenFold repository
rm -rf openfold/
```

## Step 12. Next steps

Test the installation with a well-known protein structure to verify accuracy:

```bash
## Test with ubiquitin (PDB: 1UBQ)
cat > test_ubiquitin.fasta << 'EOF'
>1UBQ
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
EOF

python openfold/run_pretrained_openfold.py \
    --fasta_path test_ubiquitin.fasta \
    --job_name ubiquitin_test \
    --output_dir openfold_out \
    --model_device cuda:0 \
    --param_path $OF_PARAM_PATH \
    --pdb70_database_path $OF_DB_PDB70 \
    --uniref90_database_path $OF_DB_UNIREF90 \
    --mgnify_database_path $OF_DB_MGNIFY \
    --uniclust30_database_path $OF_DB_UNICLUST30 \
    --bfd_database_path $OF_DB_BFD \
    --template_mmcif_dir $OF_DB_MMCIF \
    --obsolete_pdbs_path $OF_DB_OBSOLETE \
    --skip_relaxation
```

For production use, consider:
- Enabling structure relaxation for higher accuracy (remove `--skip_relaxation`)
- Setting up batch processing for multiple sequences
- Integrating with drug discovery pipelines
- Scaling to full proteomes using DGX Spark clusters
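
The batch-processing item above can start from a multi-record FASTA: split it into one file per sequence, then loop the `run_pretrained_openfold.py` command over the pieces. A minimal splitter, offered as a sketch rather than part of the playbook's assets:

```python
import os

def split_fasta(fasta_text, out_dir):
    """Write each `>header` record of a multi-FASTA to its own file.

    Returns the list of paths written; headers are sanitized for filenames.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    record = []

    def flush():
        if not record:
            return
        # Build a safe filename from the header (without the leading '>')
        name = "".join(c if c.isalnum() else "_" for c in record[0][1:].strip())
        path = os.path.join(out_dir, f"{name}.fasta")
        with open(path, "w") as f:
            f.write("\n".join(record) + "\n")
        paths.append(path)

    for line in fasta_text.splitlines():
        if line.startswith(">"):
            flush()
            record = [line]
        elif line.strip():
            record.append(line.strip())
    flush()
    return paths
```

Each generated file can then be passed to `--fasta_path` in a shell loop.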
@ -1,227 +0,0 @@
# SGLang Inference Server

> Install and use SGLang on DGX Spark

## Table of Contents

- [Overview](#overview)
- [Time & risk](#time--risk)
- [Instructions](#instructions)

---

## Overview

## Basic Idea

SGLang is a fast serving framework for large language models and vision language models that makes
your interaction with models faster and more controllable by co-designing the backend runtime and
frontend language. This setup uses the optimized NVIDIA SGLang NGC Container on a single NVIDIA
Spark device with Blackwell architecture, providing GPU-accelerated inference with all dependencies
pre-installed.

## What you'll accomplish

You'll deploy SGLang in both server and offline inference modes on your NVIDIA Spark device,
enabling high-performance LLM serving with support for text generation, chat completion, and
vision-language tasks using models like DeepSeek-V2-Lite.

## What to know before starting

- Working in a terminal environment on Linux systems
- Basic understanding of Docker containers and container management
- Familiarity with NVIDIA GPU drivers and CUDA toolkit concepts
- Experience with HTTP API endpoints and JSON request/response handling

## Prerequisites

- NVIDIA Spark device with Blackwell architecture
- Docker Engine installed and running: `docker --version`
- NVIDIA GPU drivers installed: `nvidia-smi`
- NVIDIA Container Toolkit configured: `docker run --rm --gpus all nvidia/cuda:12.9-base-ubuntu20.04 nvidia-smi`
- Sufficient disk space (>20GB available): `df -h`
- Network connectivity for pulling NGC containers: `ping nvcr.io`

## Ancillary files

- An offline inference python script [found here on GitHub](https://github.com/NVIDIA/dgx-spark-playbooks/blob/main/nvidia/sglang/assets/offline-inference.py)

### Time & risk

**Duration:** 15-30 minutes for initial setup and validation

**Risk level:** Low - Uses pre-built, validated NGC container with minimal configuration

**Rollback:** Stop and remove containers with `docker stop` and `docker rm` commands

## Instructions

## Step 1. Verify system prerequisites

Check that your NVIDIA Spark device meets all requirements before proceeding. This step runs on
your host system and ensures Docker, GPU drivers, and container toolkit are properly configured.

```bash
## Verify Docker installation
docker --version

## Check NVIDIA GPU drivers
nvidia-smi

## Test NVIDIA Container Toolkit
docker run --rm --gpus all nvidia/cuda:12.9-base-ubuntu20.04 nvidia-smi

## Check available disk space
df -h /
```

## Step 2. Pull the SGLang NGC Container

Download the latest SGLang container from NVIDIA NGC. This step runs on the host and may take
several minutes depending on your network connection.

> TODO: Verify the exact container tag/version for SGLang NGC container

```bash
## Pull the SGLang container
docker pull nvcr.io/nvidia/sglang:<VERSION>-py3

## Verify the image was downloaded
docker images | grep sglang
```

## Step 3. Launch SGLang container for server mode

Start the SGLang container in server mode to enable HTTP API access. This runs the inference
server inside the container, exposing it on port 30000 for client connections.

```bash
## Launch container with GPU support and port mapping
docker run --gpus all -it --rm \
    -p 30000:30000 \
    -v /tmp:/tmp \
    nvcr.io/nvidia/sglang:<VERSION>-py3 \
    bash
```

## Step 4. Start the SGLang inference server

Inside the container, launch the HTTP inference server with a supported model. This step runs
inside the Docker container and starts the SGLang server daemon.

```bash
## Start the inference server with DeepSeek-V2-Lite model
python3 -m sglang.launch_server \
    --model-path deepseek-ai/DeepSeek-V2-Lite \
    --host 0.0.0.0 \
    --port 30000 \
    --trust-remote-code \
    --tp 1 &

## Wait for server to initialize
sleep 30

## Check server status
curl http://localhost:30000/health
```
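
The fixed `sleep 30` above is only a guess at startup time; loading a large model can take longer on the first run. A stdlib-only polling helper (a sketch; `wait_for_server` is not an SGLang API) can wait until the health endpoint answers instead:

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url, timeout_s=300, interval_s=2.0):
    """Poll `url` until it returns HTTP 200 or `timeout_s` elapses.

    Returns True once the server answers, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short pause
        time.sleep(interval_s)
    return False
```

Calling `wait_for_server("http://localhost:30000/health")` then replaces the `sleep`/`curl` pair.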

## Step 5. Test client-server inference

From a new terminal on your host system, test the SGLang server API to ensure it's working
correctly. This validates that the server is accepting requests and generating responses.

```bash
## Test with curl
curl -X POST http://localhost:30000/generate \
    -H "Content-Type: application/json" \
    -d '{
        "text": "What does NVIDIA love?",
        "sampling_params": {
            "temperature": 0.7,
            "max_new_tokens": 100
        }
    }'
```

## Step 6. Test Python client API

Create a simple Python script to test programmatic access to the SGLang server. This runs on
the host system and demonstrates how to integrate SGLang into applications.

```python
import requests

## Send prompt to server
response = requests.post('http://localhost:30000/generate', json={
    'text': 'What does NVIDIA love?',
    'sampling_params': {
        'temperature': 0.7,
        'max_new_tokens': 100,
    },
})

print(f"Response: {response.json()['text']}")
```

## Step 7. Test offline inference mode

Launch a new container instance for offline inference to demonstrate local model usage without
HTTP server. This runs entirely within the container for batch processing scenarios.

> TODO: Incorporate the offline inference script from the assets directory. [See here](https://github.com/NVIDIA/dgx-spark-playbooks/blob/main/nvidia/sglang/assets)

## Step 8. Validate installation

Confirm that both server and offline modes are working correctly. This step verifies the
complete SGLang setup and ensures reliable operation.

```bash
## Check server mode (from host)
curl http://localhost:30000/health
curl -X POST http://localhost:30000/generate -H "Content-Type: application/json" \
    -d '{"text": "Hello", "sampling_params": {"max_new_tokens": 10}}'

## Check container logs
docker ps
docker logs <CONTAINER_ID>
```

## Step 9. Troubleshooting

Common issues and their resolutions:

| Symptom | Cause | Fix |
|---------|-------|-----|
| Container fails to start with GPU errors | NVIDIA drivers/toolkit missing | Install nvidia-container-toolkit, restart Docker |
| Server responds with 404 or connection refused | Server not fully initialized | Wait 60 seconds, check container logs |
| Out of memory errors during model loading | Insufficient GPU memory | Use smaller model or increase --tp parameter |
| Model download fails | Network connectivity issues | Check internet connection, retry download |
| Permission denied accessing /tmp | Volume mount issues | Use full path: -v /tmp:/tmp or create dedicated directory |

## Step 10. Cleanup and rollback

Stop and remove containers to clean up resources. This step returns your system to its
original state.

> [!WARNING]
> This will stop all SGLang containers and remove temporary data.

```bash
## Stop all SGLang containers
docker ps | grep sglang | awk '{print $1}' | xargs docker stop

## Remove stopped containers
docker container prune -f

## Remove SGLang images (optional)
docker rmi nvcr.io/nvidia/sglang:<VERSION>-py3
```

## Step 11. Next steps

With SGLang successfully deployed, you can now:

- Integrate the HTTP API into your applications using the `/generate` endpoint
- Experiment with different models by changing the `--model-path` parameter
- Scale up using multiple GPUs by adjusting the `--tp` (tensor parallel) setting
- Deploy production workloads using the container orchestration platform of your choice
@ -1,30 +0,0 @@
#
# SPDX-FileCopyrightText: Copyright (c) 1993-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import sglang as sgl

def main():
    llm = sgl.Engine(model_path="deepseek-ai/DeepSeek-V2-Lite", trust_remote_code=True)

    prompt = "What does NVIDIA love?"
    sampling_params = {"temperature": 0.7, "max_new_tokens": 100}

    output = llm.generate(prompt, sampling_params)
    print(f"Output: {output}")

if __name__ == '__main__':
    main()
@ -1,153 +0,0 @@
# Vibe Coding in VS Code

> Use DGX Spark as a local or remote Vibe Coding assistant with Ollama and Continue.dev

## Table of Contents

- [Overview](#overview)
- [What You'll Accomplish](#what-youll-accomplish)
- [Prerequisites](#prerequisites)
- [Requirements](#requirements)
- [Instructions](#instructions)
- [Troubleshooting](#troubleshooting)

---

## Overview

## DGX Spark Vibe Coding

This playbook walks you through setting up DGX Spark as a **Vibe Coding assistant** — locally or as a remote coding companion for VSCode with Continue.dev.
While NVIDIA NIMs are not yet widely supported, this guide uses **Ollama** with **GPT-OSS 120B** to provide a high-performance local LLM environment.

### What You'll Accomplish

You'll have a fully configured DGX Spark system capable of:
- Running local code assistance through Ollama.
- Serving models remotely for Continue.dev and VSCode integration.
- Hosting large LLMs like GPT-OSS 120B using unified memory.

### Prerequisites

- DGX Spark (128GB unified memory recommended)
- Internet access for model downloads
- Basic familiarity with the terminal
- Optional: firewall control for remote access configuration

### Requirements

- **Ollama** and an LLM of your choice (e.g., `gpt-oss:120b`)
- **VSCode**
- **Continue.dev** VSCode extension

## Instructions

## Step 1. Install Ollama

Install the latest version of Ollama using the following command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Start the Ollama service:

```bash
ollama serve
```

Once the service is running, pull the desired model:

```bash
ollama pull gpt-oss:120b
```

## Step 2. (Optional) Enable Remote Access

To allow remote connections (e.g., from a workstation using VSCode and Continue.dev), modify the Ollama systemd service:

```bash
sudo systemctl edit ollama
```

Add the following lines beneath the commented section:

```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"
```

Reload and restart the service:

```bash
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

If using a firewall, open port 11434:

```bash
sudo ufw allow 11434/tcp
```

## Step 3. Install VSCode

For DGX Spark (ARM-based), download and install VSCode:

```bash
wget "https://code.visualstudio.com/sha/download?build=stable&os=linux-deb-arm64" -O vscode-arm64.deb
sudo apt install ./vscode-arm64.deb
```

If using a remote workstation, install VSCode appropriate for your system architecture.

## Step 4. Install Continue.dev Extension

Open VSCode and install **Continue.dev** from the Marketplace.
After installation, click the Continue icon on the right-hand bar.

Skip login and open the manual configuration via the **gear (⚙️)** icon.
This opens `config.yaml`, which controls model settings.

## Step 5. Local Inference Setup

- In the Continue chat window, use `Ctrl/Cmd + L` to focus the chat.
- Click **Select Model → + Add Chat Model**
- Choose **Ollama** as the provider.
- Set **Install Provider** to default.
- For **Model**, select **Autodetect**.
- Click **Connect**.

You can now select your downloaded model (e.g., `gpt-oss:120b`) for local inference.

## Step 6. Remote Setup for DGX Spark

To connect Continue.dev to a remote DGX Spark instance, edit `config.yaml` in Continue and add:

```yaml
models:
  - model: gpt-oss:120b
    title: gpt-oss:120b
    apiBase: http://YOUR_SPARK_IP:11434/
    provider: ollama
```

Replace `YOUR_SPARK_IP` with the IP address of your DGX Spark.
Add additional model entries for any other Ollama models you wish to host remotely.
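
If you host several models, the per-model entries are repetitive to write by hand. A small stdlib-only helper (hypothetical; Continue.dev only cares about the resulting YAML) can generate the block:

```python
def continue_entries(spark_ip, models):
    """Render Continue.dev `models:` entries for Ollama models on a DGX Spark."""
    lines = ["models:"]
    for name in models:
        lines += [
            f"  - model: {name}",
            f"    title: {name}",
            f"    apiBase: http://{spark_ip}:11434/",
            "    provider: ollama",
        ]
    return "\n".join(lines)

if __name__ == "__main__":
    # Example IP; substitute your DGX Spark's address
    print(continue_entries("192.168.1.50", ["gpt-oss:120b", "gpt-oss:20b"]))
```

Paste the printed block into `config.yaml`, replacing the example above.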

## Troubleshooting

## Common Issues

**1. Ollama not starting**
- Verify the GPU drivers are installed correctly (`nvidia-smi`).
- Run `ollama serve` manually to view errors.

**2. VSCode can't connect**
- Ensure port 11434 is open and accessible from your workstation.
- Check `OLLAMA_HOST` and `OLLAMA_ORIGINS` in `/etc/systemd/system/ollama.service.d/override.conf`.

**3. High memory usage**
- Use smaller models such as `gpt-oss:20b` for lightweight usage.
- Confirm no other large models or containers are running with `nvidia-smi`.