# RAG Application in AI Workbench
> Install and use AI Workbench to clone and run a reproducible RAG application
## Table of Contents
- [Overview](#overview)
- [Instructions](#instructions)
- [Troubleshooting](#troubleshooting)
---
## Overview
## Basic idea
This walkthrough demonstrates how to set up and run an agentic retrieval-augmented generation (RAG)
project using NVIDIA AI Workbench. You'll use AI Workbench to clone and run a pre-built agentic RAG
application that intelligently routes queries, evaluates responses for relevancy and hallucination, and
iterates through evaluation and generation cycles. The project uses a Gradio web interface and can work
with either NVIDIA-hosted API endpoints or self-hosted models.
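
The control flow described above (route, retrieve, generate, evaluate, repeat) can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the project's actual code: the `Components` container, its five callables, and the `max_rounds` limit are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Components:
    """Hypothetical stand-ins for the LLM-backed pieces of an agentic RAG loop."""
    route: Callable[[str], str]                    # question -> source name
    retrieve: Callable[[str, str], List[str]]      # (source, question) -> documents
    generate: Callable[[str, List[str]], str]      # (question, documents) -> answer
    is_grounded: Callable[[str, List[str]], bool]  # hallucination check
    is_relevant: Callable[[str, str], bool]        # (question, answer) relevancy check


def answer(question: str, parts: Components, max_rounds: int = 3) -> str:
    """Route once, then generate and evaluate until a draft passes or rounds run out."""
    source = parts.route(question)
    documents = parts.retrieve(source, question)
    draft = ""
    for _ in range(max_rounds):
        draft = parts.generate(question, documents)
        grounded = parts.is_grounded(draft, documents)
        if grounded and parts.is_relevant(question, draft):
            return draft  # passed both evaluations
    return draft  # best effort: last draft after max_rounds attempts
```

Passing the components in as plain callables keeps the loop easy to exercise with stubs before any model or search API is wired in.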
## What you'll accomplish
You'll have a fully functional agentic RAG application running in NVIDIA AI Workbench with a web
interface where you can submit queries and receive intelligent responses. The system will demonstrate
advanced RAG capabilities including query routing, response evaluation, and iterative refinement,
giving you hands-on experience with both AI Workbench's development environment and sophisticated RAG
architectures.
## What to know before starting
- Basic familiarity with retrieval-augmented generation (RAG) concepts
- Understanding of API keys and how to generate them
- Comfort working with web applications and browser interfaces
- Basic understanding of containerized development environments
## Prerequisites
**Hardware Requirements:**
- NVIDIA Grace Blackwell GB10 Superchip System

**Software Requirements:**
- NVIDIA AI Workbench installed or ready to install
- Free NVIDIA API key: Generate at [NGC API Keys](https://org.ngc.nvidia.com/setup/api-keys)
- Free Tavily API key: Generate at [Tavily](https://tavily.com/)
- Internet connection for cloning repositories and accessing APIs
- Web browser for accessing the Gradio interface
## Verification commands
- Verify the NVIDIA AI Workbench application exists on your DGX Spark system
- Verify your API keys are valid and up-to-date
## Time & risk
* **Estimated time:** 30-45 minutes (including AI Workbench installation if needed)
* **Risk level:** Low - Uses pre-built containers and established APIs
* **Rollback:** Simply delete the cloned project from AI Workbench to remove all components. No system changes are made outside the AI Workbench environment.
* **Last Updated:** 11/21/2025
* Minor copyedits
2025-10-03 20:46:11 +00:00
## Instructions
## Step 1. Install NVIDIA AI Workbench
Install AI Workbench on your DGX Spark system and complete the initial setup wizard.
On your DGX Spark, open the **NVIDIA AI Workbench** application and click "Begin Installation".
1. The installation wizard will prompt for authentication
2. Wait for the automated install to complete (several minutes)
3. Click "Let's Get Started" when installation finishes
> [!NOTE]
> If you encounter the following error message, reboot your DGX Spark and then reopen NVIDIA AI Workbench:
> "An error occurred ... container tool failed to reach ready state. try again: docker is not running"
## Step 2. Verify API key requirements
Make sure you have both required API keys before proceeding with the project setup. Keep these keys safe!
* Tavily API Key: https://tavily.com/
* NVIDIA API Key: https://org.ngc.nvidia.com/setup/api-keys
  * Ensure the NVIDIA API key has ``Public API Endpoints`` permissions

Keep both keys available. You will enter them in Step 4.
## Step 3. Clone the agentic RAG project
Clone the pre-built agentic RAG project from GitHub into your AI Workbench environment.

From the AI Workbench landing page, select the **Local** location if you have not already done so, then click "Clone Project" in the top right corner.

Paste this Git repository URL in the clone dialog: https://github.com/NVIDIA/workbench-example-agentic-rag

Click "Clone" to begin the clone and build process.
## Step 4. Configure project secrets
You can then configure the API keys required for the agentic RAG application to function properly.
While the project builds, configure the API keys using the yellow warning banner that appears:
1. Click "Configure" in the yellow banner
2. Enter your ``NVIDIA_API_KEY``
3. Enter your ``TAVILY_API_KEY``
4. Save the configuration

Wait for the project build to complete before proceeding.
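
If you want to double-check the secrets from a terminal or notebook inside the project container, a small sanity check along these lines may help. It is a sketch that rests on several assumptions: that the secrets are exposed as environment variables under the names entered above, that NVIDIA keys typically begin with `nvapi-`, and that Tavily keys typically begin with `tvly-`. The `check_secrets` helper is a hypothetical name, and it does not contact either service, so it cannot confirm that a key is actually valid.

```python
import os
from typing import List, Mapping, Optional

# Assumed key prefixes; adjust them if your keys look different.
EXPECTED_PREFIXES = {
    "NVIDIA_API_KEY": "nvapi-",
    "TAVILY_API_KEY": "tvly-",
}


def check_secrets(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return human-readable problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    for name, prefix in EXPECTED_PREFIXES.items():
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value != value.strip():
            problems.append(f"{name} has leading or trailing whitespace")
        elif not value.startswith(prefix):
            problems.append(f"{name} does not start with {prefix!r}")
    return problems
```

Calling `check_secrets()` with no arguments reads the current environment. Passing a dictionary instead makes it easy to try the function without touching real keys.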
## Step 5. Launch the chat application
Start the web-based chat interface to interact with the agentic RAG system.

Navigate to **Environment** > **Project Container** > **Apps** > **Chat** and start the web application.
A browser window will open automatically and load the Gradio chat interface.
## Step 6. Test the basic functionality
Verify the agentic RAG system is working by submitting a sample query.
In the chat application, click on or type a sample query such as: `How do I add an integration in the CLI?`
Wait for the agentic system to process and respond. The response, while general, should demonstrate intelligent routing and evaluation.
## Step 7. Validate project
Confirm your setup is working correctly by testing the core features.
Verify the following components are functioning:
* Web application loads without errors
* Sample queries return responses
* No API authentication errors appear
* The agentic reasoning process is visible in the interface under "Monitor"
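
The first item in the checklist above can also be checked from a script. The sketch below uses only the Python standard library. The `app_is_up` helper is a hypothetical name, and the URL you pass in is whatever address AI Workbench opened in your browser, which this guide does not specify.

```python
import urllib.error
import urllib.request


def app_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET to `url` completes with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a malformed URL.
        return False
```

A `True` result only shows that the page is being served. It says nothing about whether queries succeed, so the remaining checklist items still need a manual look in the chat interface.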
## Step 8. Complete optional quickstart
You can evaluate advanced features by uploading data, retrieving context, and testing custom queries.

**Substep A: Upload sample dataset**

Complete the in-app quickstart instructions to upload the sample dataset and test improved RAG-based responses.

**Substep B: Test custom dataset (optional)**

Upload a custom dataset, adjust the Router prompt, and submit custom queries to test customization.
## Step 9. Cleanup and rollback
You can remove the project if needed.

> [!WARNING]
> This will permanently delete the project and all associated data.

To remove the project completely:
1. In AI Workbench, click the three dots next to the project
2. Select "Delete Project"
3. Confirm deletion when prompted

> [!NOTE]
> All changes are contained within AI Workbench. No system-level modifications were made outside the AI Workbench environment.

## Step 10. Next steps
Explore further advanced features and development options with the agentic RAG system:
* Modify component prompts in the project code
* Upload different documents to test routing and customization
* Experiment with different query types and complexity levels
* Review the agentic reasoning logs in the "Monitor" tab to understand decision-making

Consider customizing the Gradio UI or integrating the agentic RAG components into your own projects.
## Troubleshooting
| Symptom | Cause | Fix |
|---------|-------|-----|
| Tavily API Error | Internet connection or DNS issues | Wait and retry query |
| 401 Unauthorized | Wrong or malformed API key | Replace key in Project Secrets and restart |
| 403 Forbidden | API key lacks permissions | Generate new key with proper access |
| Agentic loop timeout | Complex query exceeding time limit | Try simpler query or retry |

For the latest known issues, please review the [DGX Spark User Guide](https://docs.nvidia.com/dgx/dgx-spark/known-issues.html).
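
The troubleshooting table can also be read as a simple lookup. The sketch below restates it in Python. The `suggest_fix` helper and its parameters are hypothetical, and treating every other failure as a transient network issue is an assumption made for this example, not something the table states.

```python
from typing import Optional


def suggest_fix(status_code: Optional[int] = None, timed_out: bool = False) -> str:
    """Map a symptom from the troubleshooting table to its suggested fix."""
    if timed_out:
        return "Try a simpler query or retry"
    if status_code == 401:
        return "Replace the key in Project Secrets and restart"
    if status_code == 403:
        return "Generate a new key with proper access"
    # Assumption: anything else is handled like the transient Tavily/network row.
    return "Wait and retry the query"
```

For example, `suggest_fix(403)` points at key permissions, which for the NVIDIA key means checking for ``Public API Endpoints`` access.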