From f7f0a5ec85283b9535cc5c21a1833361a4aa2e4f Mon Sep 17 00:00:00 2001 From: GitLab CI Date: Fri, 10 Oct 2025 17:30:35 +0000 Subject: [PATCH] chore: Regenerate all playbooks --- README.md | 3 +- nvidia/stack-sparks/README.md | 2 +- nvidia/vibe-coding/README.md | 153 ++++++++++++++++++++++++++++++++++ 3 files changed, 156 insertions(+), 2 deletions(-) create mode 100644 nvidia/vibe-coding/README.md diff --git a/README.md b/README.md index ddff0e1..16d037c 100644 --- a/README.md +++ b/README.md @@ -42,11 +42,12 @@ Each playbook includes prerequisites, step-by-step instructions, troubleshooting - [RAG application in AI Workbench](nvidia/rag-ai-workbench/) - [SGLang Inference Server](nvidia/sglang/) - [Speculative Decoding](nvidia/speculative-decoding/) -- [Stack two Sparks](nvidia/stack-sparks/) +- [Connect two Sparks](nvidia/stack-sparks/) - [Set up Tailscale on your Spark](nvidia/tailscale/) - [TRT LLM for Inference](nvidia/trt-llm/) - [Text to Knowledge Graph](nvidia/txt2kg/) - [Unsloth on DGX Spark](nvidia/unsloth/) +- [Vibe Coding in VS Code](nvidia/vibe-coding/) - [Install and use vLLM](nvidia/vllm/) - [Vision-Language Model Fine-tuning](nvidia/vlm-finetuning/) - [Install VS Code](nvidia/vscode/) diff --git a/nvidia/stack-sparks/README.md b/nvidia/stack-sparks/README.md index d3b8bcc..2b2213b 100644 --- a/nvidia/stack-sparks/README.md +++ b/nvidia/stack-sparks/README.md @@ -1,4 +1,4 @@ -# Stack two Sparks +# Connect two Sparks > Connect two Spark devices and setup them up for inference and fine-tuning diff --git a/nvidia/vibe-coding/README.md b/nvidia/vibe-coding/README.md new file mode 100644 index 0000000..67a7627 --- /dev/null +++ b/nvidia/vibe-coding/README.md @@ -0,0 +1,153 @@ +# Vibe Coding in VS Code + +> Use DGX Spark as a local or remote Vibe Coding assistant with Ollama and Continue.dev + +## Table of Contents + +- [Overview](#overview) + - [What You'll Accomplish](#what-youll-accomplish) + - [Prerequisites](#prerequisites) + - [Requirements](#requirements) +- [Instructions](#instructions) +- [Troubleshooting](#troubleshooting) + +--- + +## Overview + +## DGX Spark Vibe Coding + +This playbook walks you through setting up DGX Spark as a **Vibe Coding assistant** — locally or as a remote coding companion for VSCode with Continue.dev. +While NVIDIA NIMs are not yet widely supported, this guide uses **Ollama** with **GPT-OSS 120B** to provide a high-performance local LLM environment. + +### What You'll Accomplish + +You’ll have a fully configured DGX Spark system capable of: +- Running local code assistance through Ollama. +- Serving models remotely for Continue.dev and VSCode integration. +- Hosting large LLMs like GPT-OSS 120B using unified memory. + +### Prerequisites + +- DGX Spark (128GB unified memory recommended) +- Internet access for model downloads +- Basic familiarity with the terminal +- Optional: firewall control for remote access configuration + +### Requirements + +- **Ollama** and an LLM of your choice (e.g., `gpt-oss:120b`) +- **VSCode** +- **Continue.dev** VSCode extension + +## Instructions + +## Step 1. Install Ollama + +Install the latest version of Ollama using the following command: + +```bash +curl -fsSL https://ollama.com/install.sh | sh +``` + +Start the Ollama service: + +```bash +ollama serve +``` + +Once the service is running, pull the desired model: + +```bash +ollama pull gpt-oss:120b +``` + +## Step 2. (Optional) Enable Remote Access + +To allow remote connections (e.g., from a workstation using VSCode and Continue.dev), modify the Ollama systemd service: + +```bash +sudo systemctl edit ollama +``` + +Add the following lines beneath the commented section: + +```ini +[Service] +Environment="OLLAMA_HOST=0.0.0.0:11434" +Environment="OLLAMA_ORIGINS=*" +``` + +Reload and restart the service: + +```bash +sudo systemctl daemon-reload +sudo systemctl restart ollama +``` + +If using a firewall, open port 11434: + +```bash +sudo ufw allow 11434/tcp +``` + +## Step 3. Install VSCode + +For DGX Spark (ARM-based), download and install VSCode: + +```bash +wget https://code.visualstudio.com/sha/download?build=stable&os=linux-deb-arm64 -O vscode-arm64.deb +sudo apt install ./vscode-arm64.deb +``` + +If using a remote workstation, install VSCode appropriate for your system architecture. + +## Step 4. Install Continue.dev Extension + +Open VSCode and install **Continue.dev** from the Marketplace. +After installation, click the Continue icon on the right-hand bar. + +Skip login and open the manual configuration via the **gear (⚙️)** icon. +This opens `config.yaml`, which controls model settings. + +## Step 5. Local Inference Setup + +- In the Continue chat window, use `Ctrl/Cmd + L` to focus the chat. +- Click **Select Model → + Add Chat Model** +- Choose **Ollama** as the provider. +- Set **Install Provider** to default. +- For **Model**, select **Autodetect**. +- Click **Connect**. + +You can now select your downloaded model (e.g., `gpt-oss:120b`) for local inference. + +## Step 6. Remote Setup for DGX Spark + +To connect Continue.dev to a remote DGX Spark instance, edit `config.yaml` in Continue and add: + +```yaml +models: + - model: gpt-oss:120b + title: gpt-oss:120b + apiBase: http://YOUR_SPARK_IP:11434/ + provider: ollama +``` + +Replace `YOUR_SPARK_IP` with the IP address of your DGX Spark. +Add additional model entries for any other Ollama models you wish to host remotely. + +## Troubleshooting + +## Common Issues + +**1. Ollama not starting** +- Verify Docker and GPU drivers are installed correctly. +- Run `ollama serve` manually to view errors. + +**2. VSCode can’t connect** +- Ensure port 11434 is open and accessible from your workstation. +- Check `OLLAMA_HOST` and `OLLAMA_ORIGINS` in `/etc/systemd/system/ollama.service.d/override.conf`. + +**3. High memory usage** +- Use smaller models such as `gpt-oss:20b` for lightweight usage. +- Confirm no other large models or containers are running with `nvidia-smi`.