From 316b9a41fa03571fa98662a12fc5bf0ce5142faf Mon Sep 17 00:00:00 2001
From: GitLab CI
Date: Tue, 7 Oct 2025 17:40:52 +0000
Subject: [PATCH] chore: Regenerate all playbooks

---
 nvidia/speculative-decoding/README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/nvidia/speculative-decoding/README.md b/nvidia/speculative-decoding/README.md
index 494dd39..6da7f6f 100644
--- a/nvidia/speculative-decoding/README.md
+++ b/nvidia/speculative-decoding/README.md
@@ -5,7 +5,7 @@
 ## Table of Contents
 
 - [Overview](#overview)
-- [How to run inference with speculative decoding](#how-to-run-inference-with-speculative-decoding)
+- [Instructions](#instructions)
 - [Step 1. Configure Docker permissions](#step-1-configure-docker-permissions)
 - [Step 2. Run Draft-Target Speculative Decoding](#step-2-run-draft-target-speculative-decoding)
 - [Step 3. Test the Draft-Target setup](#step-3-test-the-draft-target-setup)
@@ -57,7 +57,7 @@ These examples demonstrate how to accelerate large language model inference whil
 
 **Rollback:** Stop Docker containers and optionally clean up downloaded model cache
 
-## How to run inference with speculative decoding
+## Instructions
 
 ## Traditional Draft-Target Speculative Decoding
 
@@ -169,3 +169,4 @@ docker stop
 - Experiment with different `max_draft_len` values (1, 2, 3, 4, 8)
 - Monitor token acceptance rates and throughput improvements
 - Test with different prompt lengths and generation parameters
+- Read more on Speculative Decoding [here](https://nvidia.github.io/TensorRT-LLM/advanced/speculative-decoding.html)
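
For reviewers unfamiliar with the README being regenerated: its closing tips mention tuning `max_draft_len` and monitoring token acceptance rates. The draft-target loop those tips refer to can be sketched with toy stand-ins for the two models. Both functions below are illustrative assumptions (deterministic integer "models", not TensorRT-LLM APIs); only the accept-or-correct control flow matches the technique.

```python
def draft_model(prefix, max_draft_len):
    """Cheap model: propose up to max_draft_len next tokens.
    This toy draft counts upward but erroneously skips multiples of 7."""
    guess, proposed = prefix[-1], []
    for _ in range(max_draft_len):
        guess += 1
        if guess % 7 == 0:  # deliberate draft error to force a rejection
            guess += 1
        proposed.append(guess)
    return proposed

def target_model(prefix, candidates):
    """Expensive model: verify drafted tokens in one pass.
    Accepts the longest matching prefix, then emits one corrected token,
    so every call produces at least one token (the standard guarantee)."""
    accepted, last = [], prefix[-1]
    for tok in candidates:
        expected = last + 1  # the target's "true" next token
        if tok != expected:
            return accepted + [expected]  # reject tok, substitute correction
        accepted.append(tok)
        last = tok
    return accepted + [last + 1]  # all drafts accepted: bonus token

def speculative_decode(prompt, max_new_tokens, max_draft_len):
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        drafted = draft_model(out, max_draft_len)
        out += target_model(out, drafted)
    return out[:len(prompt) + max_new_tokens]

print(speculative_decode([0], 8, max_draft_len=3))  # → [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

A larger `max_draft_len` amortizes more target-model calls per accepted run but wastes draft work when acceptance rates are low, which is why the README suggests sweeping values like 1-8 while watching acceptance rate and throughput.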