diff --git a/nvidia/connect-two-sparks/README.md b/nvidia/connect-two-sparks/README.md index a187e12..1a138ba 100644 --- a/nvidia/connect-two-sparks/README.md +++ b/nvidia/connect-two-sparks/README.md @@ -125,7 +125,8 @@ sudo chmod 600 /etc/netplan/40-cx7.yaml sudo netplan apply ``` -Note: Using this option, the IPs assigned to the interfaces will change if you reboot the system. +> [!NOTE] +> Using this option, the IPs assigned to the interfaces will change if you reboot the system. **Option 2: Manual IP Assignment (Advanced)** @@ -187,7 +188,8 @@ You may be prompted for your password for each node. SSH setup complete! Both local and remote nodes can now SSH to each other without passwords. ``` -Note: If you encounter any errors, please follow Option 2 below to manually configure SSH and debug the issue. +> [!NOTE] +> If you encounter any errors, please follow Option 2 below to manually configure SSH and debug the issue. #### Option 2: Manually discover and configure SSH diff --git a/nvidia/multi-modal-inference/README.md b/nvidia/multi-modal-inference/README.md index 0e0e0d3..0c08ccb 100644 --- a/nvidia/multi-modal-inference/README.md +++ b/nvidia/multi-modal-inference/README.md @@ -12,7 +12,7 @@ ## Overview -* Basic idea +## Basic idea Multi-modal inference combines different data types, such as **text, images, and audio**, within a single model pipeline to generate or interpret richer outputs. Instead of processing one input type at a time, multi-modal systems use shared representations that enable tasks such as **text-to-image generation**, **image captioning**, or **vision-language reasoning**. 
@@ -54,12 +54,16 @@ All necessary files can be found in the TensorRT repository [here on GitHub](htt ## Time & risk -**Duration**: 45-90 minutes depending on model downloads and optimization steps +- **Duration**: 45-90 minutes depending on model downloads and optimization steps -**Risks**: Large model downloads may timeout; high VRAM requirements may cause OOM errors; -quantized models may show quality degradation +- **Risks**: + - Large model downloads may time out + - High VRAM requirements may cause OOM errors + - Quantized models may show quality degradation -**Rollback**: Remove downloaded models from HuggingFace cache, exit container environment +- **Rollback**: + - Remove downloaded models from HuggingFace cache + - Then exit the container environment ## Instructions diff --git a/nvidia/nccl/README.md b/nvidia/nccl/README.md index 8892e29..a293075 100644 --- a/nvidia/nccl/README.md +++ b/nvidia/nccl/README.md @@ -12,7 +12,7 @@ ## Overview -## Basic Idea +## Basic idea NCCL (NVIDIA Collective Communication Library) enables high-performance GPU-to-GPU communication across multiple nodes. This walkthrough sets up NCCL for multi-node distributed training on @@ -41,9 +41,9 @@ and proper GPU topology detection. 
## Time & risk -* **Duration**: 30 minutes for setup and validation -* **Risk level**: Medium - involves network configuration changes -* **Rollback**: The NCCL & NCCL Tests repositories can be deleted from DGX Spark +- **Duration**: 30 minutes for setup and validation +- **Risk level**: Medium - involves network configuration changes +- **Rollback**: The NCCL & NCCL Tests repositories can be deleted from DGX Spark ## Run on two Sparks diff --git a/nvidia/nvfp4-quantization/README.md b/nvidia/nvfp4-quantization/README.md index e7bdef7..bda636b 100644 --- a/nvidia/nvfp4-quantization/README.md +++ b/nvidia/nvfp4-quantization/README.md @@ -5,7 +5,6 @@ ## Table of Contents - [Overview](#overview) - - [Basic Idea](#basic-idea) - [Instructions](#instructions) - [Troubleshooting](#troubleshooting) @@ -14,7 +13,6 @@ ## Overview ## Basic idea -### Basic Idea NVFP4 is a 4-bit floating-point format introduced with NVIDIA Blackwell GPUs to maintain model accuracy while reducing memory bandwidth and storage requirements for inference workloads. Unlike uniform INT4 quantization, NVFP4 retains floating-point semantics with a shared exponent and a compact mantissa, allowing higher dynamic range and more stable convergence. diff --git a/nvidia/open-webui/README.md b/nvidia/open-webui/README.md index 479c58c..430551a 100644 --- a/nvidia/open-webui/README.md +++ b/nvidia/open-webui/README.md @@ -116,11 +116,7 @@ Once complete, select "gpt-oss:20b" from the model dropdown. This step verifies that the complete setup is working properly by testing model inference through the web interface. -In the chat textarea at the bottom of the Open WebUI interface, enter: - -``` -Write me a haiku about GPUs -``` +In the chat text area at the bottom of the Open WebUI interface, enter: **Write me a haiku about GPUs** Press Enter to send the message and wait for the model's response. @@ -303,11 +299,7 @@ Once complete, select "gpt-oss:20b" from the model dropdown. ## Step 8. 
Test the model -In the chat textarea at the bottom of the Open WebUI interface, enter: - -``` -Write me a haiku about GPUs -``` +In the chat text area at the bottom of the Open WebUI interface, enter: **Write me a haiku about GPUs** Press Enter to send the message and wait for the model's response. diff --git a/nvidia/trt-llm/README.md b/nvidia/trt-llm/README.md index 4fa2f95..a17ed2c 100644 --- a/nvidia/trt-llm/README.md +++ b/nvidia/trt-llm/README.md @@ -31,10 +31,10 @@ - [Step 14. Cleanup and rollback](#step-14-cleanup-and-rollback) - [Step 15. Next steps](#step-15-next-steps) - [Open WebUI for TensorRT-LLM](#open-webui-for-tensorrt-llm) - - [Prerequisites](#prerequisites) - - [Step 1. Launch Open WebUI container](#step-1-launch-open-webui-container) - - [Step 2. Access the interface](#step-2-access-the-interface) - - [Step 3. Cleanup and rollback](#step-3-cleanup-and-rollback) + - [Step 1. Set up the prerequisites to use Open WebUI with TRT-LLM](#step-1-set-up-the-prerequisites-to-use-open-webui-with-trt-llm) + - [Step 2. Launch Open WebUI container](#step-2-launch-open-webui-container) + - [Step 3. Access the Open WebUI interface](#step-3-access-the-open-webui-interface) + - [Step 4. Cleanup and rollback](#step-4-cleanup-and-rollback) - [Troubleshooting](#troubleshooting) --- @@ -650,17 +650,17 @@ You can now deploy other models on your DGX Spark cluster. ## Open WebUI for TensorRT-LLM -## Open WebUI for TensorRT-LLM +### Step 1. Set up the prerequisites to use Open WebUI with TRT-LLM -After setting up TensorRT-LLM inference server in either single-node or multi-node configuration, you can deploy Open WebUI to interact with your models through a user-friendly interface. - -### Prerequisites +After setting up the TensorRT-LLM inference server in either single-node or multi-node configuration, +you can deploy Open WebUI to interact with your models through a user-friendly interface. 
To get set up, make sure the following +is in order: - TensorRT-LLM inference server running and accessible at http://localhost:8355 - Docker installed and configured (see earlier steps) - Port 3000 available on your DGX Spark -### Step 1. Launch Open WebUI container +### Step 2. Launch Open WebUI container Run the following command on the DGX Spark node where you have the TensorRT-LLM inference server running. For multi-node setup, this would be the primary node. @@ -687,7 +687,7 @@ This command: - Enables automatic container restart - Uses the latest Open WebUI image -### Step 2. Access the interface +### Step 3. Access the Open WebUI interface Open your web browser and navigate to: @@ -706,7 +706,7 @@ You can select your model(s) from the dropdown menu on the top left corner. That > [!NOTE] > If accessing from a remote machine, replace localhost with your DGX Spark's IP address. -### Step 3. Cleanup and rollback +### Step 4. Cleanup and rollback > [!WARNING] > This removes all chat data and may require re-uploading for future runs. 
diff --git a/nvidia/txt2kg/README.md b/nvidia/txt2kg/README.md index 6d36675..c384837 100644 --- a/nvidia/txt2kg/README.md +++ b/nvidia/txt2kg/README.md @@ -43,16 +43,16 @@ The setup includes: ## Time & risk -**Duration**: - 2-3 minutes for initial setup and container deployment - 5-10 minutes for Ollama model download (depending on model size) - Immediate document processing and knowledge graph generation +- **Duration**: + - 2-3 minutes for initial setup and container deployment + - 5-10 minutes for Ollama model download (depending on model size) + - Immediate document processing and knowledge graph generation -**Risks**: -- GPU memory requirements depend on chosen Ollama model size -- Document processing time scales with document size and complexity +- **Risks**: + - GPU memory requirements depend on chosen Ollama model size + - Document processing time scales with document size and complexity -**Rollback**: Stop and remove Docker containers, delete downloaded models if needed +- **Rollback**: Stop and remove Docker containers, delete downloaded models if needed ## Instructions