diff --git a/nvidia/nvfp4-quantization/README.md b/nvidia/nvfp4-quantization/README.md
index 3ee80ea..6ad35b4 100644
--- a/nvidia/nvfp4-quantization/README.md
+++ b/nvidia/nvfp4-quantization/README.md
@@ -22,7 +22,6 @@
 - Cuts memory use ~3.5x vs FP16 and ~1.8x vs FP8
 - Keeps accuracy close to FP8 (usually <1% loss)
 - Improves speed and energy efficiency for inference
-- **Ecosystem:** Supported in NVIDIA tools (TensorRT, LLM Compressor, vLLM) and Hugging Face models.
 
 ## What you'll accomplish
 
@@ -43,7 +42,7 @@ inside a TensorRT-LLM container, producing an NVFP4 quantized model for deployme
 - NVIDIA Spark device with Blackwell architecture GPU
 - Docker installed with GPU support
 - NVIDIA Container Toolkit configured
-- At least 32GB of available storage for model files and outputs
+- Sufficient available storage for model files and outputs
 - Hugging Face account with access to the target model
 
 Verify your setup:
@@ -53,9 +52,6 @@ docker run --rm --gpus all nvcr.io/nvidia/tensorrt-llm/release:spark-single-gpu-
 
 ## Verify sufficient disk space
 df -h .
-
-## Check Hugging Face CLI (install if needed: pip install huggingface_hub)
-huggingface-cli whoami
 ```
 
 ## Time & risk
@@ -133,7 +129,8 @@ docker run --rm -it --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=671
 "
 ```
 
-Warning: If your model is too large, you may encounter an out of memory error. You can try quantizing a smaller model instead.
+Note: You may encounter the error `pynvml.NVMLError_NotSupported: Not Supported`. This is expected in some environments, does not affect results, and will be fixed in an upcoming release.
+Note: If your model is too large, you may encounter an out-of-memory error; try quantizing a smaller model instead.
 
 This command:
 - Runs the container with full GPU access and optimized shared memory settings