docs: update wiki for timeline events and incident response methodology
Some checks failed
Test / rust-fmt-check (pull_request) Successful in 1m12s
Test / frontend-typecheck (pull_request) Successful in 1m17s
Test / frontend-tests (pull_request) Successful in 1m25s
PR Review Automation / review (pull_request) Failing after 2m45s
Test / rust-clippy (pull_request) Successful in 4m26s
Test / rust-tests (pull_request) Successful in 5m42s
- Database.md: document timeline_events table (migration 017), event types, dual-write strategy, correct migration count to 17
- IPC-Commands.md: document get_timeline_events, updated add_timeline_event with metadata, chat_message system_prompt param
- Architecture.md: document incident response methodology integration, 5-phase framework, system prompt injection, correct migration count
parent 8b0cbc3ce8
commit d715ba0b25
@@ -50,7 +50,7 @@ All command handlers receive `State<'_, AppState>` as a Tauri-injected parameter
| `commands/integrations.rs` | Confluence / ServiceNow / ADO — v0.2 stubs |
| `ai/provider.rs` | `Provider` trait + `create_provider()` factory |
| `pii/detector.rs` | Multi-pattern PII scanner with overlap resolution |
-| `db/migrations.rs` | Versioned schema (12 migrations in `_migrations` table) |
+| `db/migrations.rs` | Versioned schema (17 migrations in `_migrations` table) |
| `db/models.rs` | All DB types — see `IssueDetail` note below |
| `docs/rca.rs` + `docs/postmortem.rs` | Markdown template builders |
| `audit/log.rs` | `write_audit_event()` — called before every external send |
@@ -176,6 +176,55 @@ pub struct IssueDetail {

Use `detail.issue.title`, **not** `detail.title`.

## Incident Response Methodology

The application integrates a comprehensive incident response framework via system prompt injection. The `INCIDENT_RESPONSE_FRAMEWORK` constant in `src/lib/domainPrompts.ts` is appended to all 17 domain-specific system prompts (Linux, Windows, Network, Kubernetes, Databases, Virtualization, Hardware, Observability, and others).

**5-Phase Framework:**

1. **Detection & Evidence Gathering** — Initial issue assessment, log collection, PII redaction
2. **Diagnosis & Hypothesis Testing** — AI-assisted analysis, pattern matching against known incidents
3. **Root Cause Analysis with 5-Whys** — Iterative questioning to identify underlying cause (steps 1–5)
4. **Resolution & Prevention** — Remediation planning and implementation
5. **Post-Incident Review** — Timeline-based blameless post-mortem and lessons learned

**System Prompt Injection:**

The `chat_message` command accepts an optional `system_prompt` parameter. If provided, it prepends domain expertise before the conversation history. If omitted, the framework selects the appropriate domain prompt based on the issue category. This allows:

- **Specialized expertise**: Different frameworks for Linux vs. Kubernetes vs. Network incidents
- **Flexible override**: Users can inject custom system prompts for cross-domain problems
- **Consistent methodology**: All 17 domain prompts follow the same 5-phase incident response structure
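As a rough illustration of the selection rule described above, the sketch below appends a shared framework string to each domain prompt and lets an explicit override win. Only `INCIDENT_RESPONSE_FRAMEWORK` is named in the wiki text; `BASE_PROMPTS`, `GENERIC_PROMPT`, `resolveSystemPrompt`, the placeholder prompt strings, and the generic fallback for unknown categories are assumptions made for this example, not the contents of `src/lib/domainPrompts.ts`.

```typescript
// Hypothetical sketch: every name except INCIDENT_RESPONSE_FRAMEWORK is an assumption.
const INCIDENT_RESPONSE_FRAMEWORK = [
  "Follow the 5-phase incident response framework:",
  "1. Detection & Evidence Gathering",
  "2. Diagnosis & Hypothesis Testing",
  "3. Root Cause Analysis with 5-Whys",
  "4. Resolution & Prevention",
  "5. Post-Incident Review",
].join("\n");

// Placeholder domain expertise; real prompts would be far longer.
const BASE_PROMPTS: Record<string, string> = {
  linux: "You are a senior Linux systems engineer.",
  kubernetes: "You are a Kubernetes platform engineer.",
  network: "You are a network engineer.",
};

// Append the shared framework to every domain prompt.
const DOMAIN_PROMPTS: Record<string, string> = {};
for (const domain of Object.keys(BASE_PROMPTS)) {
  DOMAIN_PROMPTS[domain] = BASE_PROMPTS[domain] + "\n\n" + INCIDENT_RESPONSE_FRAMEWORK;
}

// Assumed fallback for categories without a dedicated prompt.
const GENERIC_PROMPT =
  "You are an experienced IT support engineer.\n\n" + INCIDENT_RESPONSE_FRAMEWORK;

function resolveSystemPrompt(category: string, override?: string): string {
  // An explicit, non-empty override always wins.
  if (override !== undefined && override.trim() !== "") {
    return override;
  }
  // Otherwise pick the prompt for the issue category, if one exists.
  if (Object.prototype.hasOwnProperty.call(DOMAIN_PROMPTS, category)) {
    return DOMAIN_PROMPTS[category];
  }
  return GENERIC_PROMPT;
}
```

Treating a whitespace-only override as absent is a choice made for this sketch; the application may handle that case differently.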

**Timeline Event Recording:**

Timeline events are recorded non-blockingly at key triage moments:

```
Issue Creation → triage_started
↓
Log Upload → log_uploaded (metadata: file_name, file_size)
↓
Why-Level Progression → why_level_advanced (metadata: from_level → to_level)
↓
Root Cause Identified → root_cause_identified (metadata: root_cause, confidence)
↓
RCA Generated → rca_generated (metadata: doc_id, section_count)
↓
Postmortem Generated → postmortem_generated (metadata: doc_id, timeline_events_count)
↓
Document Exported → document_exported (metadata: format, file_path)
```
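One way to read "non-blocking" in the flow above is a fire-and-forget call: the write is started but never awaited, and a failure is swallowed so it cannot interrupt triage. The sketch below shows that pattern under stated assumptions. `recordTimelineEvent`, `PendingTimelineEvent`, and the injected `send` callback are hypothetical names, and the real call sites may be structured differently.

```typescript
// Hypothetical sketch: these names are illustrative, not the application's API.
type TimelineEventType =
  | "triage_started"
  | "log_uploaded"
  | "why_level_advanced"
  | "root_cause_identified"
  | "rca_generated"
  | "postmortem_generated"
  | "document_exported";

interface PendingTimelineEvent {
  issueId: string;
  eventType: TimelineEventType;
  description: string;
  metadata?: Record<string, unknown>;
}

// Stand-in for whatever actually performs the write (for example an IPC call).
type SendFn = (pending: PendingTimelineEvent) => Promise<void>;

// Fire-and-forget: start the write, do not await it, and route any failure
// (synchronous throw or rejected promise) to onError instead of the caller.
function recordTimelineEvent(
  send: SendFn,
  pending: PendingTimelineEvent,
  onError: (err: unknown) => void = () => {},
): void {
  try {
    send(pending).catch(onError);
  } catch (err) {
    onError(err);
  }
}
```

The caller gets control back immediately, so a slow or failing timeline write does not delay the log upload or chat response that triggered it. The trade-off is that failures are only visible through `onError`.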

**Document Generation:**

RCA and Postmortem generators now use real timeline event data instead of placeholders:

- **RCA**: Incorporates timeline to show detection-to-root-cause progression
- **Postmortem**: Uses full timeline to demonstrate the complete incident lifecycle and response effectiveness

Timeline events are stored in the `timeline_events` table (indexed by issue_id and created_at for fast retrieval) and dual-written to `audit_log` for security/compliance purposes.
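To make "real timeline event data instead of placeholders" concrete, here is a small sketch that turns stored events into a Markdown timeline section. The actual builders live in `docs/rca.rs` and `docs/postmortem.rs` and are written in Rust; `renderTimelineSection`, `TimelineRow`, the table layout, and the empty-state text are assumptions chosen only to illustrate the idea.

```typescript
// Hypothetical sketch: the real generators are Rust template builders.
interface TimelineRow {
  event_type: string;
  description: string;
  created_at: string; // "YYYY-MM-DD HH:MM:SS" in UTC
}

function renderTimelineSection(rows: TimelineRow[]): string {
  if (rows.length === 0) {
    return "## Timeline\n\n_No timeline events recorded._\n";
  }
  // Timestamps in this format sort chronologically as plain strings.
  const sorted = rows.slice().sort((a, b) =>
    a.created_at < b.created_at ? -1 : a.created_at > b.created_at ? 1 : 0,
  );
  // Escape pipes so a description cannot break the table layout.
  const escapeCell = (text: string): string => text.replace(/\|/g, "\\|");
  const lines = sorted.map(
    (row) =>
      "| " + row.created_at + " | " + row.event_type + " | " + escapeCell(row.description) + " |",
  );
  return ["## Timeline", "", "| Time (UTC) | Event | Description |", "|---|---|---|"]
    .concat(lines)
    .concat([""])
    .join("\n");
}
```

An RCA would typically pass only the events up to `root_cause_identified`, while a post-mortem would pass the full list.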

## Application Startup Sequence

```
@@ -2,7 +2,7 @@

## Overview

-TFTSR uses **SQLite** via `rusqlite` with the `bundled-sqlcipher` feature for AES-256 encryption in production. 12 versioned migrations are tracked in the `_migrations` table.
+TFTSR uses **SQLite** via `rusqlite` with the `bundled-sqlcipher` feature for AES-256 encryption in production. 17 versioned migrations are tracked in the `_migrations` table.

**DB file location:** `{app_data_dir}/tftsr.db`

@@ -38,7 +38,7 @@ pub fn init_db(data_dir: &Path) -> anyhow::Result<Connection> {

---

-## Schema (11 Migrations)
+## Schema (17 Migrations)

### 001 — issues

@@ -245,6 +245,51 @@ CREATE TABLE image_attachments (
- Basic auth (ServiceNow): Store encrypted password
- One credential per service (enforced by UNIQUE constraint)

### 017 — timeline_events (Incident Response Timeline)

```sql
CREATE TABLE timeline_events (
    id TEXT PRIMARY KEY,
    issue_id TEXT NOT NULL REFERENCES issues(id) ON DELETE CASCADE,
    event_type TEXT NOT NULL,
    description TEXT NOT NULL,
    metadata TEXT, -- JSON object with event-specific data
    created_at TEXT NOT NULL
);

CREATE INDEX idx_timeline_events_issue ON timeline_events(issue_id);
CREATE INDEX idx_timeline_events_time ON timeline_events(created_at);
```

**Event Types:**
- `triage_started` — Incident response begins, initial issue properties recorded
- `log_uploaded` — Log file uploaded and analyzed
- `why_level_advanced` — 5-Whys entry completed, progression to next level
- `root_cause_identified` — Root cause determined from analysis
- `rca_generated` — Root Cause Analysis document created
- `postmortem_generated` — Post-mortem document created
- `document_exported` — Document exported to file (MD or PDF)

**Metadata Structure (JSON):**
```json
{
  "triage_started": {"severity": "high", "category": "network"},
  "log_uploaded": {"file_name": "app.log", "file_size": 2048576},
  "why_level_advanced": {"from_level": 2, "to_level": 3, "question": "Why did the service timeout?"},
  "root_cause_identified": {"root_cause": "DNS resolution failure", "confidence": 0.95},
  "rca_generated": {"doc_id": "doc_abc123", "section_count": 7},
  "postmortem_generated": {"doc_id": "doc_def456", "timeline_events_count": 12},
  "document_exported": {"format": "pdf", "file_path": "/home/user/docs/rca.pdf"}
}
```
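Reading the JSON above as one example payload per event type, the shapes can be captured in a type map so that each event type only accepts its own metadata. This is a sketch under that reading. `TimelineMetadataMap`, `buildTimelineEvent`, and `decodeMetadata` are hypothetical names, and the field types are inferred from the sample values rather than taken from the codebase.

```typescript
// Hypothetical sketch: field types are inferred from the example payloads.
interface TimelineMetadataMap {
  triage_started: { severity: string; category: string };
  log_uploaded: { file_name: string; file_size: number };
  why_level_advanced: { from_level: number; to_level: number; question: string };
  root_cause_identified: { root_cause: string; confidence: number };
  rca_generated: { doc_id: string; section_count: number };
  postmortem_generated: { doc_id: string; timeline_events_count: number };
  document_exported: { format: string; file_path: string };
}

type KnownTimelineEventType = keyof TimelineMetadataMap;

// The metadata column is TEXT, so the payload is serialized on the way in.
function buildTimelineEvent<T extends KnownTimelineEventType>(
  eventType: T,
  metadata: TimelineMetadataMap[T],
): { event_type: T; metadata: string } {
  return { event_type: eventType, metadata: JSON.stringify(metadata) };
}

// On the way out, tolerate NULL, invalid JSON, and non-object values.
function decodeMetadata(raw: string | null): Record<string, unknown> | null {
  if (raw === null) {
    return null;
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
    return null;
  } catch {
    return null;
  }
}
```

With this map, a call such as `buildTimelineEvent("log_uploaded", { doc_id: "x" })` fails type checking, which catches mismatched payloads before they reach the database.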

**Design Notes:**
- Timeline events are **queryable** (indexed by issue_id and created_at) for document generation
- Dual-write: Events recorded to both `timeline_events` and `audit_log` — timeline for chronological reporting, audit_log for security/compliance
- `created_at`: TEXT UTC timestamp (`YYYY-MM-DD HH:MM:SS`)
- Non-blocking writes: Timeline events recorded asynchronously at key triage moments
- Cascade delete from issues ensures cleanup
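The `created_at` format noted above has a useful property: because every field is zero-padded and ordered from largest to smallest unit, plain string comparison matches chronological order. The helper below shows one way to produce that format from a `Date`. `toTimelineTimestamp` is a hypothetical name, and in the application the timestamp is presumably generated by the Rust backend rather than in TypeScript.

```typescript
// Hypothetical helper: formats a Date as "YYYY-MM-DD HH:MM:SS" in UTC.
function toTimelineTimestamp(date: Date): string {
  // Zero-pad to two digits so string order matches time order.
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return (
    date.getUTCFullYear() +
    "-" + pad(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    "-" + pad(date.getUTCDate()) +
    " " + pad(date.getUTCHours()) +
    ":" + pad(date.getUTCMinutes()) +
    ":" + pad(date.getUTCSeconds())
  );
}
```

For example, `toTimelineTimestamp(new Date(Date.UTC(2024, 0, 5, 9, 3, 7)))` yields `2024-01-05 09:03:07`.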

---

## Key Design Notes
@@ -289,4 +334,13 @@ pub struct AuditEntry {
    pub user_id: String,
    pub details: Option<String>,
}

pub struct TimelineEvent {
    pub id: String,
    pub issue_id: String,
    pub event_type: String,
    pub description: String,
    pub metadata: Option<String>, // JSON
    pub created_at: String,
}
```
@@ -62,11 +62,27 @@ updateFiveWhyCmd(entryId: string, answer: string) → void
```
Sets or updates the answer for an existing 5-Whys entry.

### `get_timeline_events`
```typescript
getTimelineEventsCmd(issueId: string) → TimelineEvent[]
```
Retrieves all timeline events for an issue, ordered by created_at ascending.
```typescript
interface TimelineEvent {
  id: string;
  issue_id: string;
  event_type: string; // One of: triage_started, log_uploaded, why_level_advanced, etc.
  description: string;
  metadata?: Record<string, any>; // Event-specific JSON data
  created_at: string; // UTC timestamp
}
```
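As an example of consuming the list that `getTimelineEventsCmd` returns, the sketch below computes how long an incident took to go from `triage_started` to `root_cause_identified`. `secondsToRootCause` and `parseTimelineUtc` are hypothetical helpers. The example redeclares the interface under the name `TimelineEventLike` (with `unknown` in place of `any`) so it stays self-contained, and it assumes `created_at` uses the `YYYY-MM-DD HH:MM:SS` UTC format.

```typescript
// Hypothetical sketch built on the TimelineEvent shape documented above.
interface TimelineEventLike {
  id: string;
  issue_id: string;
  event_type: string;
  description: string;
  metadata?: Record<string, unknown>;
  created_at: string; // "YYYY-MM-DD HH:MM:SS", UTC
}

// Convert the stored format to ISO 8601 so Date.parse treats it as UTC.
function parseTimelineUtc(timestamp: string): number {
  return Date.parse(timestamp.replace(" ", "T") + "Z");
}

// Seconds between the first triage_started and the first
// root_cause_identified event, or null if either is missing.
function secondsToRootCause(events: TimelineEventLike[]): number | null {
  let started: TimelineEventLike | undefined;
  let identified: TimelineEventLike | undefined;
  for (const entry of events) {
    if (!started && entry.event_type === "triage_started") {
      started = entry;
    }
    if (!identified && entry.event_type === "root_cause_identified") {
      identified = entry;
    }
  }
  if (!started || !identified) {
    return null;
  }
  return (parseTimelineUtc(identified.created_at) - parseTimelineUtc(started.created_at)) / 1000;
}
```

A metric like this is the kind of figure a post-mortem can derive from the timeline.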

### `add_timeline_event`
```typescript
-addTimelineEventCmd(issueId: string, eventType: string, description: string) → TimelineEvent
+addTimelineEventCmd(issueId: string, eventType: string, description: string, metadata?: Record<string, any>) → TimelineEvent
```
-Records a timestamped event in the issue timeline.
+Records a timestamped event in the issue timeline. Dual-writes to both `timeline_events` (for document generation) and `audit_log` (for security audit trail).
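The dual-write can be pictured with an in-memory model: one incoming event produces one row in each store. This is only a sketch of the idea. The real writes happen in the Rust backend against SQLite, and the `AuditRecord` fields, the `action` naming scheme, and the `EventStores` container are assumptions invented for the example.

```typescript
// Hypothetical sketch: AuditRecord's shape and the in-memory stores are
// assumptions, not the application's audit_log schema.
interface TimelineRecord {
  id: string;
  issue_id: string;
  event_type: string;
  description: string;
  metadata: string | null; // JSON text, mirroring the TEXT column
  created_at: string;
}

interface AuditRecord {
  action: string;
  entity_id: string;
  details: string;
}

interface EventStores {
  timeline: TimelineRecord[];
  audit: AuditRecord[];
}

function addTimelineEvent(stores: EventStores, record: TimelineRecord): TimelineRecord {
  // Build both rows before writing, so a serialization error
  // cannot leave one store updated without the other.
  const auditRow: AuditRecord = {
    action: "timeline_event:" + record.event_type,
    entity_id: record.issue_id,
    details: JSON.stringify({ event_id: record.id, description: record.description }),
  };
  stores.timeline.push(record);
  stores.audit.push(auditRow);
  return record;
}
```

In a database-backed version, wrapping both inserts in a single transaction would give the same all-or-nothing behaviour.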

---

@@ -137,9 +153,9 @@ Sends selected (redacted) log files to the AI provider with an analysis prompt.

### `chat_message`
```typescript
-chatMessageCmd(issueId: string, message: string, providerConfig: ProviderConfig) → ChatResponse
+chatMessageCmd(issueId: string, message: string, providerConfig: ProviderConfig, systemPrompt?: string) → ChatResponse
```
-Sends a message in the ongoing triage conversation. Domain system prompt is injected automatically on first message. AI response is parsed for why-level indicators (1–5).
+Sends a message in the ongoing triage conversation. Optional `systemPrompt` parameter allows prepending domain expertise before conversation history. If not provided, the domain-specific system prompt for the issue category is injected automatically on first message. AI response is parsed for why-level indicators (1–5).
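The ordering described here, with the system prompt first, then the conversation history, then the new message, might look like the sketch below. `ChatTurn` and `buildProviderMessages` are hypothetical names. Dropping previously stored system turns when an override is supplied is an assumption of this example rather than documented behaviour.

```typescript
// Hypothetical sketch of message assembly; not the backend's implementation.
interface ChatTurn {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildProviderMessages(
  priorTurns: ChatTurn[],
  userMessage: string,
  systemPrompt?: string,
): ChatTurn[] {
  const hasOverride = systemPrompt !== undefined && systemPrompt.trim() !== "";
  const messages: ChatTurn[] = [];
  // An explicit system prompt goes ahead of the conversation history.
  if (hasOverride) {
    messages.push({ role: "system", content: systemPrompt as string });
  }
  for (const turn of priorTurns) {
    // Assumption: with an override, earlier system turns are dropped so the
    // provider sees exactly one system prompt.
    if (hasOverride && turn.role === "system") {
      continue;
    }
    messages.push(turn);
  }
  messages.push({ role: "user", content: userMessage });
  return messages;
}
```

Without an override, the history is passed through unchanged, which matches the case where the domain prompt was already injected on the first message.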

### `list_providers`
```typescript
@@ -155,13 +171,13 @@ Returns the list of supported providers with their available models and configur
```typescript
generateRcaCmd(issueId: string) → Document
```
-Builds an RCA Markdown document from the issue data, 5-Whys answers, and timeline.
+Builds an RCA Markdown document from the issue data, 5-Whys answers, and timeline events. Uses real incident response timeline (log uploads, why-level progression, root cause identification) instead of placeholders.

### `generate_postmortem`
```typescript
generatePostmortemCmd(issueId: string) → Document
```
-Builds a blameless post-mortem Markdown document.
+Builds a blameless post-mortem Markdown document. Incorporates timeline events to show the full incident lifecycle: detection, diagnosis, resolution, and post-incident review phases.

### `update_document`
```typescript