docs: update wiki for timeline events and incident response methodology
Some checks failed
Test / rust-fmt-check (pull_request) Successful in 1m12s
Test / frontend-typecheck (pull_request) Successful in 1m17s
Test / frontend-tests (pull_request) Successful in 1m25s
PR Review Automation / review (pull_request) Failing after 2m45s
Test / rust-clippy (pull_request) Successful in 4m26s
Test / rust-tests (pull_request) Successful in 5m42s

- Database.md: document timeline_events table (migration 017), event
  types, dual-write strategy, correct migration count to 17
- IPC-Commands.md: document get_timeline_events, updated
  add_timeline_event with metadata, chat_message system_prompt param
- Architecture.md: document incident response methodology integration,
  5-phase framework, system prompt injection, correct migration count
This commit is contained in:
Shaun Arman 2026-04-19 18:26:21 -05:00
parent 8b0cbc3ce8
commit d715ba0b25
3 changed files with 128 additions and 9 deletions

View File

@ -50,7 +50,7 @@ All command handlers receive `State<'_, AppState>` as a Tauri-injected parameter
| `commands/integrations.rs` | Confluence / ServiceNow / ADO — v0.2 stubs |
| `ai/provider.rs` | `Provider` trait + `create_provider()` factory |
| `pii/detector.rs` | Multi-pattern PII scanner with overlap resolution |
| `db/migrations.rs` | Versioned schema (12 migrations in `_migrations` table) |
| `db/migrations.rs` | Versioned schema (17 migrations in `_migrations` table) |
| `db/models.rs` | All DB types — see `IssueDetail` note below |
| `docs/rca.rs` + `docs/postmortem.rs` | Markdown template builders |
| `audit/log.rs` | `write_audit_event()` — called before every external send |
@ -176,6 +176,55 @@ pub struct IssueDetail {
Use `detail.issue.title`, **not** `detail.title`.
## Incident Response Methodology
The application integrates a comprehensive incident response framework via system prompt injection. The `INCIDENT_RESPONSE_FRAMEWORK` constant in `src/lib/domainPrompts.ts` is appended to all 17 domain-specific system prompts (Linux, Windows, Network, Kubernetes, Databases, Virtualization, Hardware, Observability, and others).
**5-Phase Framework:**
1. **Detection & Evidence Gathering** — Initial issue assessment, log collection, PII redaction
2. **Diagnosis & Hypothesis Testing** — AI-assisted analysis, pattern matching against known incidents
3. **Root Cause Analysis with 5-Whys** — Iterative questioning to identify underlying cause (steps 15)
4. **Resolution & Prevention** — Remediation planning and implementation
5. **Post-Incident Review** — Timeline-based blameless post-mortem and lessons learned
**System Prompt Injection:**
The `chat_message` command accepts an optional `system_prompt` parameter. If provided, it prepends domain expertise before the conversation history. If omitted, the framework selects the appropriate domain prompt based on the issue category. This allows:
- **Specialized expertise**: Different frameworks for Linux vs. Kubernetes vs. Network incidents
- **Flexible override**: Users can inject custom system prompts for cross-domain problems
- **Consistent methodology**: All 17 domain prompts follow the same 5-phase incident response structure
**Timeline Event Recording:**
Timeline events are recorded non-blockingly at key triage moments:
```
Issue Creation → triage_started
Log Upload → log_uploaded (metadata: file_name, file_size)
Why-Level Progression → why_level_advanced (metadata: from_level → to_level)
Root Cause Identified → root_cause_identified (metadata: root_cause, confidence)
RCA Generated → rca_generated (metadata: doc_id, section_count)
Postmortem Generated → postmortem_generated (metadata: doc_id, timeline_events_count)
Document Exported → document_exported (metadata: format, file_path)
```
**Document Generation:**
RCA and Postmortem generators now use real timeline event data instead of placeholders:
- **RCA**: Incorporates timeline to show detection-to-root-cause progression
- **Postmortem**: Uses full timeline to demonstrate the complete incident lifecycle and response effectiveness
Timeline events are stored in the `timeline_events` table (indexed by issue_id and created_at for fast retrieval) and dual-written to `audit_log` for security/compliance purposes.
## Application Startup Sequence
```

View File

@ -2,7 +2,7 @@
## Overview
TFTSR uses **SQLite** via `rusqlite` with the `bundled-sqlcipher` feature for AES-256 encryption in production. 12 versioned migrations are tracked in the `_migrations` table.
TFTSR uses **SQLite** via `rusqlite` with the `bundled-sqlcipher` feature for AES-256 encryption in production. 17 versioned migrations are tracked in the `_migrations` table.
**DB file location:** `{app_data_dir}/tftsr.db`
@ -38,7 +38,7 @@ pub fn init_db(data_dir: &Path) -> anyhow::Result<Connection> {
---
## Schema (11 Migrations)
## Schema (17 Migrations)
### 001 — issues
@ -245,6 +245,51 @@ CREATE TABLE image_attachments (
- Basic auth (ServiceNow): Store encrypted password
- One credential per service (enforced by UNIQUE constraint)
### 017 — timeline_events (Incident Response Timeline)
```sql
CREATE TABLE timeline_events (
id TEXT PRIMARY KEY,
issue_id TEXT NOT NULL REFERENCES issues(id) ON DELETE CASCADE,
event_type TEXT NOT NULL,
description TEXT NOT NULL,
metadata TEXT, -- JSON object with event-specific data
created_at TEXT NOT NULL
);
CREATE INDEX idx_timeline_events_issue ON timeline_events(issue_id);
CREATE INDEX idx_timeline_events_time ON timeline_events(created_at);
```
**Event Types:**
- `triage_started` — Incident response begins, initial issue properties recorded
- `log_uploaded` — Log file uploaded and analyzed
- `why_level_advanced` — 5-Whys entry completed, progression to next level
- `root_cause_identified` — Root cause determined from analysis
- `rca_generated` — Root Cause Analysis document created
- `postmortem_generated` — Post-mortem document created
- `document_exported` — Document exported to file (MD or PDF)
**Metadata Structure (JSON):**
```json
{
"triage_started": {"severity": "high", "category": "network"},
"log_uploaded": {"file_name": "app.log", "file_size": 2048576},
"why_level_advanced": {"from_level": 2, "to_level": 3, "question": "Why did the service timeout?"},
"root_cause_identified": {"root_cause": "DNS resolution failure", "confidence": 0.95},
"rca_generated": {"doc_id": "doc_abc123", "section_count": 7},
"postmortem_generated": {"doc_id": "doc_def456", "timeline_events_count": 12},
"document_exported": {"format": "pdf", "file_path": "/home/user/docs/rca.pdf"}
}
```
**Design Notes:**
- Timeline events are **queryable** (indexed by issue_id and created_at) for document generation
- Dual-write: Events recorded to both `timeline_events` and `audit_log` — timeline for chronological reporting, audit_log for security/compliance
- `created_at`: TEXT UTC timestamp (`YYYY-MM-DD HH:MM:SS`)
- Non-blocking writes: Timeline events recorded asynchronously at key triage moments
- Cascade delete from issues ensures cleanup
---
## Key Design Notes
@ -289,4 +334,13 @@ pub struct AuditEntry {
pub user_id: String,
pub details: Option<String>,
}
pub struct TimelineEvent {
pub id: String,
pub issue_id: String,
pub event_type: String,
pub description: String,
pub metadata: Option<String>, // JSON
pub created_at: String,
}
```

View File

@ -62,11 +62,27 @@ updateFiveWhyCmd(entryId: string, answer: string) → void
```
Sets or updates the answer for an existing 5-Whys entry.
### `get_timeline_events`
```typescript
getTimelineEventsCmd(issueId: string) → TimelineEvent[]
```
Retrieves all timeline events for an issue, ordered by created_at ascending.
```typescript
interface TimelineEvent {
id: string;
issue_id: string;
event_type: string; // One of: triage_started, log_uploaded, why_level_advanced, etc.
description: string;
metadata?: Record<string, any>; // Event-specific JSON data
created_at: string; // UTC timestamp
}
```
### `add_timeline_event`
```typescript
addTimelineEventCmd(issueId: string, eventType: string, description: string) → TimelineEvent
addTimelineEventCmd(issueId: string, eventType: string, description: string, metadata?: Record<string, any>) → TimelineEvent
```
Records a timestamped event in the issue timeline.
Records a timestamped event in the issue timeline. Dual-writes to both `timeline_events` (for document generation) and `audit_log` (for security audit trail).
---
@ -137,9 +153,9 @@ Sends selected (redacted) log files to the AI provider with an analysis prompt.
### `chat_message`
```typescript
chatMessageCmd(issueId: string, message: string, providerConfig: ProviderConfig) → ChatResponse
chatMessageCmd(issueId: string, message: string, providerConfig: ProviderConfig, systemPrompt?: string) → ChatResponse
```
Sends a message in the ongoing triage conversation. Domain system prompt is injected automatically on first message. AI response is parsed for why-level indicators (15).
Sends a message in the ongoing triage conversation. Optional `systemPrompt` parameter allows prepending domain expertise before conversation history. If not provided, the domain-specific system prompt for the issue category is injected automatically on first message. AI response is parsed for why-level indicators (15).
### `list_providers`
```typescript
@ -155,13 +171,13 @@ Returns the list of supported providers with their available models and configur
```typescript
generateRcaCmd(issueId: string) → Document
```
Builds an RCA Markdown document from the issue data, 5-Whys answers, and timeline.
Builds an RCA Markdown document from the issue data, 5-Whys answers, and timeline events. Uses real incident response timeline (log uploads, why-level progression, root cause identification) instead of placeholders.
### `generate_postmortem`
```typescript
generatePostmortemCmd(issueId: string) → Document
```
Builds a blameless post-mortem Markdown document.
Builds a blameless post-mortem Markdown document. Incorporates timeline events to show the full incident lifecycle: detection, diagnosis, resolution, and post-incident review phases.
### `update_document`
```typescript