Commit Graph

5 Commits

Author SHA1 Message Date
Santosh Bhavani
529debb633 perf(docker): increase Ollama parallel processing for DGX
- Increase OLLAMA_NUM_PARALLEL from 1 to 4 requests
- Leverage DGX Spark's unified memory architecture
- Improve throughput for concurrent inference requests
2025-10-19 20:56:58 -07:00
Santosh Bhavani
8c1d2ae9f3 feat(docker): add vector search services and GPU configuration
- Add optional Pinecone and sentence-transformers services for vector search
- Configure NVIDIA GPU support with proper environment variables
- Add new environment variables for embeddings and Pinecone
- Add docker compose profiles to optionally enable vector-search
- Improve CUDA configuration for Ollama service
- Add pinecone-net network for service communication
2025-10-19 19:56:55 -07:00
Santosh Bhavani
9dc734eee5 Add NVIDIA_API_KEY support and update ollama to v0.12.6
2025-10-19 14:52:24 -05:00
GitLab CI
89b4835335 chore: Regenerate all playbooks
2025-10-10 18:45:20 +00:00
GitLab CI
27fe116e71 chore: Regenerate all playbooks
2025-10-06 17:05:41 +00:00
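
For orientation, the configuration changes described in commits 529debb633 and 8c1d2ae9f3 might look roughly like the Compose fragment below. This is a sketch reconstructed from the commit messages only, not the repository's actual file: the service names, the image tag, the build path, and the exact GPU stanza are assumptions. Only OLLAMA_NUM_PARALLEL=4, the vector-search profile, and the pinecone-net network are named in the commits.

```yaml
# Hypothetical docker-compose.yml fragment; names and paths are illustrative.
services:
  ollama:
    image: ollama/ollama:0.12.6          # tag assumed from "update ollama to v0.12.6"
    environment:
      - OLLAMA_NUM_PARALLEL=4            # raised from 1 in commit 529debb633
      - NVIDIA_VISIBLE_DEVICES=all       # assumed GPU visibility setting
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  sentence-transformers:                 # hypothetical service name
    build: ./sentence-transformers       # hypothetical build context
    profiles: ["vector-search"]          # started only when the profile is enabled
    networks:
      - pinecone-net

networks:
  pinecone-net: {}
```

Under these assumptions, the optional services would start only when the profile is requested, for example with `docker compose --profile vector-search up`.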