Comprehensive cleanup of ALL proprietary terms: **1. API Format Renaming:** - msi-genai → generic-genai (everywhere) - is_msi_genai_format() → is_generic_genai_format() - chat_msi_genai() → chat_generic_genai() - All test function names updated **2. Vesta/VNXT Complete Removal:** - VESTA NXT → DevOps Platform - All vesta/vnxt references → platform/devops - Files: CHANGELOG.md, query_expansion.rs, domainPrompts.ts - Fixed test expectations (removed nxt keyword check) **3. CI Workflow Fix:** - Moved Node.js installation BEFORE cache action - actions/cache@v4 requires Node to be installed first - Fixes: 'exec: "node": executable file not found in /Users/sarman/.local/bin:/Users/sarman/.bun/bin:/Users/sarman/.codeium/windsurf/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/Users/sarman/.local/bin:/Users/sarman/.opencode/bin:/Users/sarman/.cargo/bin:/opt/homebrew/opt/gnu-sed/libexec/gnubin:/Library/Frameworks/Python.framework/Versions/3.6/bin:/opt/local/bin:/opt/local/sbin:/usr/local/opt/coreutils/libexec/gnubin:/opt/metasploit-framework/bin:/Users/sarman/git/SQL:/Users/sarman/git/mass-scripts:/Users/sarman/gitpersonal:/Users/sarman/git/scripts:/Users/sarman/git/sysadmin-util:/usr/local/mysql/bin:/opt/bin/:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Library/Apple/usr/bin:/Applications/iTerm.app/Contents/Resources/utilities:/libexec/bin:/Users/sarman/bin/:/Users/sarman/bin/mass_scripts/:/usr/local/Cellar/mysql/5.7.21/bin:/usr/local/mariadb10/bin:/Users/sarman/bin/scripts:/Users/sarman/bin/SQL/:/Users/sarman/bin/bert_scripts/:/Users/sarman/bin/ecw/:/Users/sarman/bin/mass-scripts/:/Users/sarman/bin/nhudson:/Users/sarman/bin/personal/:/Users/sarman/bin/python_learning/:/Users/sarman/bin/svn/:/Users/sarman/sysadmin-util/:/Users/sarman/was_scripts/:/Users/sarman/.lmstudio/bin:/Users/sarman/.lmstudio/bin:/Users/sarman/.claude/plugins/cache/claude-plugins-official/swift-lsp/1.0.0/bin:/Users/sarman/.claude/plugins/cache/claude-plugins-official/rust-analyzer-lsp/1.0.0/bin:/Users/sarman/.claude/plugins/cache/knowledge-work-plugins/productivity/1.3.0/bin:/Users/sarman/.claude/plugins/cache/knowledge-work-plugins/customer-support/1.3.0/bin:/Users/sarman/.claude/plugins/cache/knowledge-work-plugins/product-management/1.2.0/bin:/Users/sarman/.claude/plugins/cache/knowledge-work-plugins/engineering/1.2.0/bin' **4. Preserved:** - .msi file extension (Windows installer format - valid) - .exe file extension (Windows executable - valid) **Verification:** - ✅ 308 Rust tests passing - ✅ 92 frontend tests passing - ✅ Zero proprietary references remaining Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
14 KiB
AI Providers
TRCAA supports 6+ AI providers, including custom providers with flexible authentication and API formats. API keys are stored encrypted with AES-256-GCM.
Provider Factory
ai/provider.rs::create_provider(config) dispatches on config.name to the matching implementation. Adding a provider requires implementing the Provider trait and adding a match arm.
pub trait Provider {
async fn chat(&self, messages: Vec<Message>, config: &ProviderConfig) -> Result<ChatResponse>;
fn name(&self) -> &str;
}
Supported Providers
1. OpenAI-Compatible
Covers: OpenAI, Azure OpenAI, LM Studio, vLLM, LiteLLM (AWS Bedrock), and any OpenAI-API-compatible endpoint.
| Field | Value |
|---|---|
config.name |
"openai" |
| Default URL | https://api.openai.com/v1/chat/completions |
| Auth | Authorization: Bearer <api_key> |
| Max tokens | 4096 |
Models: gpt-4o, gpt-4o-mini, gpt-4-turbo
Custom endpoint: Set config.base_url to any OpenAI-compatible API:
- LM Studio:
http://localhost:1234/v1 - LiteLLM (AWS Bedrock):
http://localhost:8000/v1— See LiteLLM + Bedrock Setup for full configuration guide
2. Anthropic Claude
| Field | Value |
|---|---|
config.name |
"anthropic" |
| URL | https://api.anthropic.com/v1/messages |
| Auth | x-api-key: <api_key> + anthropic-version: 2023-06-01 |
| Max tokens | 4096 |
Models: claude-sonnet-4-20250514, claude-haiku-4-20250414, claude-3-5-sonnet-20241022
3. Google Gemini
| Field | Value |
|---|---|
config.name |
"gemini" |
| URL | https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent |
| Auth | x-goog-api-key: <api_key> header |
| Max tokens | 4096 |
Models: gemini-2.0-flash, gemini-2.0-pro, gemini-1.5-pro, gemini-1.5-flash
Transport Security Notes
- Provider clients use TLS certificate verification via
reqwest - Provider calls are configured with explicit request timeouts to avoid indefinite hangs
- Credentials are sent in headers (not URL query strings)
4. Mistral AI
| Field | Value |
|---|---|
config.name |
"mistral" |
| Default URL | https://api.mistral.ai/v1/chat/completions |
| Auth | Authorization: Bearer <api_key> |
| Max tokens | 4096 |
Models: mistral-large-latest, mistral-medium-latest, mistral-small-latest, open-mistral-nemo
Uses OpenAI-compatible request/response format.
5. Ollama (Local / Offline)
| Field | Value |
|---|---|
config.name |
"ollama" |
| Default URL | http://localhost:11434 |
| Auth | None |
| Max tokens | No limit enforced |
| Tool Calling | ✅ Fully Supported (v1.0.7+) |
| Timeout | 180s (tool calling), 60s (regular chat) |
| Retry Logic | 3 attempts with 2s delay |
Recommended Models (≥3B parameters):
| Model | Size | Min RAM | Notes |
|---|---|---|---|
llama3.2:3b |
2.0 GB | 6 GB | Balanced performance |
phi3.5:3.8b |
2.2 GB | 6 GB | Excellent reasoning |
llama3.1:8b |
4.7 GB | 10 GB | RECOMMENDED - Strong IT analysis |
qwen2.5:14b |
9.0 GB | 16 GB | Best for complex log analysis |
gemma2:9b |
5.5 GB | 12 GB | Google's efficient model |
⚠️ Important: Models with <3B parameters (e.g., llama3.2:1b) cannot reliably follow tool calling instructions. They will describe tools instead of invoking them.
Features:
- ✅ Function Calling Support (v1.0.7): Executes shell commands, kubectl operations
- ✅ Multi-turn Tool Conversations: Preserves
tool_call_idfor correlation - ✅ Resilient Parsing: Skips malformed tool calls with warnings
- ✅ Connection Reliability (v1.0.8): Health checks, retry logic, extended timeouts
- ✅ Auto-Start: Automatically starts Ollama service if not running
- ✅ Fully Offline: No internet required, complete privacy
Custom URL: Change the Ollama URL in Settings → AI Providers → Ollama (stored in settingsStore.ollama_url).
Troubleshooting:
| Error | Cause | Solution |
|---|---|---|
| "Cannot connect to Ollama" | Service not running | Run ollama serve or check auto-start |
| Timeout after 60s (chat) / 180s (tool calling) | Model too slow / tool calling needs more time | Use a smaller model, reduce tool usage, or wait for the higher tool-calling timeout to elapse |
| Tool calls described but not executed | Model too small (<3B) | Use llama3.2:3b or larger |
| Model not loaded | First request loads model | Wait 5-10s for model to load into VRAM |
Performance Tips:
- Use quantized models (Q4_K_M or Q4_0) for faster responses
- Keep model loaded with
ollama run <model>in background - Monitor VRAM usage - models stay loaded for 5 minutes by default
Domain System Prompts
Each triage conversation is pre-loaded with a domain-specific expert system prompt from src/lib/domainPrompts.ts.
| Domain | Key areas covered |
|---|---|
| Linux | systemd, filesystem, memory, networking, kernel, performance |
| Windows | Event Viewer, Active Directory, IIS, Group Policy, clustering |
| Network | DNS, firewalls, load balancers, BGP/OSPF, Layer 2, VPN |
| Kubernetes | Pod failures, service mesh, ingress, storage, Helm |
| Databases | Connection pools, slow queries, indexes, replication, MongoDB/Redis |
| Virtualization | vMotion, storage (VMFS), HA, snapshots; KVM/QEMU |
| Hardware | RAID, SMART data, ECC memory errors, thermal, BIOS/firmware |
| Observability | Prometheus/Grafana, ELK/OpenSearch, tracing, SLO/SLI burn rates |
The domain prompt is injected as the first system role message in every new conversation.
6. Custom Provider (Multiple API Formats)
Status: ✅ Implemented (v0.2.6)
Custom providers allow integration with non-OpenAI-compatible APIs. The application supports multiple API formats:
Format: OpenAI Compatible (Default)
Standard OpenAI /chat/completions endpoint with Bearer authentication.
| Field | Default Value |
|---|---|
api_format |
"openai" |
custom_endpoint_path |
/chat/completions |
custom_auth_header |
Authorization |
custom_auth_prefix |
Bearer |
Use cases:
- Self-hosted LLMs with OpenAI-compatible APIs
- Custom proxy services
- Enterprise gateways
Format: TFTSR GenAI
TFTSR GenAI Gateway — Enterprise AI gateway with model proxying and cost tracking.
| Field | Value |
|---|---|
config.provider_type |
"custom" |
config.api_format |
"generic-genai" |
| Status | ⚠️ Limited compatibility |
Known Limitations:
- ❌ Tool calling not supported: Gateway returns
503 Service Unavailablewith error"Gemini Filter Triggered: UNEXPECTED_TOOL_CALL" - ❌ Shell execution unavailable: Cannot use
execute_shell_commandor other function calling features - ✅ Basic chat works: Text-only conversations function correctly
- ✅ Workaround parser included: Attempts to extract tool calls from malformed responses (ChatGPT JSON in
msgfield, Claude XML wrapper)
Recommendation: Use LiteLLM + AWS Bedrock (see LiteLLM Setup Guide) or Ollama for full tool calling support.
Root Cause: TFTSR GenAI gateway applies content filtering that blocks structured tool call responses before they reach the client. This is a gateway-level restriction that cannot be worked around from the client side.
Configuration (if needed for text-only use):
Name: TFTSR GenAI
Type: Custom
API Format: TFTSR GenAI
API URL: https://your-gateway/api/v2/chat
Model: your-model-name
API Key: (your API key)
User ID: user@example.com (optional, for cost tracking)
Format: Custom REST (Generic)
Generic Enterprise AI Gateway — For AI platforms that use a non-OpenAI request/response format with centralized cost tracking and model access.
| Field | Value |
|---|---|
config.provider_type |
"custom" |
config.api_format |
"custom_rest" |
| API URL | Your gateway's base URL |
| Auth Header | Your gateway's auth header name |
| Auth Prefix | `` (empty if no prefix needed) |
| Endpoint Path | `` (empty if URL already includes full path) |
Request Format:
{
"model": "model-name",
"prompt": "User's latest message",
"system": "Optional system prompt",
"sessionId": "uuid-for-conversation-continuity",
"userId": "user@example.com"
}
Response Format:
{
"status": true,
"sessionId": "uuid",
"msg": "AI response text",
"initialPrompt": false
}
Key Differences from OpenAI:
- Single prompt instead of message array (server manages history via
sessionId) - Response in
msgfield instead ofchoices[0].message.content - Session-based conversation continuity (no need to resend history)
- Cost tracking via
userIdfield (optional)
Configuration (Settings → AI Providers → Add Provider):
Name: Custom REST Gateway
Type: Custom
API Format: Custom REST
API URL: https://your-gateway/api/v2/chat
Model: your-model-name
API Key: (your API key)
User ID: user@example.com (optional, for cost tracking)
Endpoint Path: (leave empty if URL includes full path)
Auth Header: x-custom-api-key
Auth Prefix: (leave empty if no prefix)
Troubleshooting:
| Error | Cause | Solution |
|---|---|---|
| 403 Forbidden | Invalid API key or insufficient permissions | Verify key in your gateway portal, check model access |
Missing userId field |
Configuration not saved | Ensure UI shows User ID field when api_format=custom_rest |
| No conversation history | sessionId not persisted |
Session ID stored in ProviderConfig.session_id — currently per-provider, not per-conversation |
Implementation Details:
- Backend:
src-tauri/src/ai/openai.rs::chat_custom_rest() - Schema:
src-tauri/src/state.rs::ProviderConfig(addeduser_id,api_format, custom auth fields) - Frontend:
src/pages/Settings/AIProviders.tsx(conditional UI for Custom REST + model dropdown)
Custom Provider Configuration Fields
All providers support the following optional configuration fields (v0.2.6+):
| Field | Type | Purpose | Default |
|---|---|---|---|
custom_endpoint_path |
Option<String> |
Override endpoint path | /chat/completions |
custom_auth_header |
Option<String> |
Custom auth header name | Authorization |
custom_auth_prefix |
Option<String> |
Prefix before API key | Bearer |
api_format |
Option<String> |
API format (openai or custom_rest) |
openai |
session_id |
Option<String> |
Session ID for stateful APIs | None |
user_id |
Option<String> |
User ID for cost tracking (Custom REST gateways) | None |
supports_tool_calling |
Option<bool> |
Enable function/tool calling | true for built-in providers, false for custom |
Backward Compatibility: All fields are optional and default to OpenAI-compatible behavior. Existing provider configurations are unaffected.
Tool Calling Auto-Detection
Status: ✅ Implemented (v1.0.9+)
TRCAA can automatically detect whether a custom AI provider supports tool calling (function calling) by sending a test tool call and analyzing the response.
How It Works
- Navigate to Settings → AI Providers → Add/Edit Custom Provider
- Configure your provider (API URL, key, model)
- Click "Auto-Detect Tool Calling Support" button
- System sends a simple test tool call to the provider
- Checkbox automatically enabled/disabled based on result
- Success/warning message displayed
Detection Criteria
| Scenario | Result | Explanation |
|---|---|---|
Provider returns tool_calls array with test tool |
✅ Tool calling supported | Checkbox enabled automatically |
| Provider responds without tool_calls | ⚠️ Not supported | Checkbox disabled automatically |
| Gateway returns 503 / "tool" error | ⚠️ Blocked at gateway level | Checkbox disabled (e.g., TFTSR GenAI) |
| Connection/auth/timeout error | ❌ Error displayed | User must fix connection issue |
Test Tool
The auto-detection sends this minimal tool:
{
"name": "test_tool",
"description": "A test tool that returns 'success'. Call this tool with no arguments.",
"parameters": {
"type": "object",
"properties": {},
"required": []
}
}
Known Limitations
- TFTSR GenAI: Gateway blocks tool calls with
503 UNEXPECTED_TOOL_CALLbefore they reach the model. Auto-detect correctly identifies this as "not supported." - Small Models: Models <3B parameters (e.g.,
llama3.2:1b) may respond but describe tools instead of calling them. Auto-detect may returntrue(model capability) but runtime behavior will fail. - Timeout: Detection uses same timeout as regular chat (60-180s depending on provider). Slow providers may timeout during detection.
Manual Override
You can always manually toggle the supports_tool_calling checkbox:
- ✅ Enable: For providers you know support tool calling
- ❌ Disable: For text-only chat without shell execution or integrations
Adding a New Provider
- Create
src-tauri/src/ai/{name}.rsimplementing theProvidertrait - Add a match arm in
ai/provider.rs::create_provider() - Add the model list in
commands/ai.rs::list_providers() - Add the TypeScript type in
src/lib/tauriCommands.ts - Add a UI entry in
src/pages/Settings/AIProviders.tsx