Workflow: Operations Checklist
Use this checklist for day-2 operations and incident triage.
1. Service Health and Readiness
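The probes listed in this step can also be run in one pass. The following is a minimal sketch, not an official tool: the function names `classify_http_code` and `probe_all`, the `BASE_URL` variable, the 5-second timeout, and the rule that any 2xx response counts as healthy are all assumptions made for illustration.

```shell
#!/bin/sh
# Base URL taken from the commands in this checklist; override via BASE_URL.
BASE_URL="${BASE_URL:-https://k3.dev.dodil.io}"

# Map an HTTP status code to a coarse verdict.
# Assumption: any 2xx code means the probe passed.
classify_http_code() {
  case "$1" in
    2[0-9][0-9]) echo "ok" ;;
    *)           echo "fail" ;;
  esac
}

# Probe each endpoint once and print "/<endpoint> <code> <verdict>".
# Returns non-zero if any probe did not yield a 2xx code.
probe_all() {
  rc=0
  for endpoint in health healthz ready readyz metrics; do
    # -o /dev/null discards the body; -w prints only the status code.
    # curl reports 000 when no HTTP response was received.
    code=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 5 \
      "$BASE_URL/$endpoint" 2>/dev/null)
    verdict=$(classify_http_code "$code")
    echo "/$endpoint $code $verdict"
    [ "$verdict" = "ok" ] || rc=1
  done
  return "$rc"
}
```

Calling `probe_all` prints one line per endpoint, and its exit status can gate later steps of the checklist. The individual commands remain useful when the response body itself needs inspection.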
curl -sS "https://k3.dev.dodil.io/health"
curl -sS "https://k3.dev.dodil.io/healthz"
curl -sS "https://k3.dev.dodil.io/ready"
curl -sS "https://k3.dev.dodil.io/readyz"
curl -sS "https://k3.dev.dodil.io/metrics"
2. Bucket and Engine Baseline
dodil k3 bucket list
dodil k3 vector-store get --bucket "$K3_BUCKET"
dodil k3 engine get --bucket "$K3_BUCKET"
3. Ingest Pipeline Health
dodil k3 ingest list --bucket "$K3_BUCKET"
dodil k3 ingest jobs --bucket "$K3_BUCKET" -o json
Focus on:
- repeated FAILED or RETRYING jobs
- spikes in pending jobs
- mismatched source/rule/pipeline IDs
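To make repeated failures and pending spikes easier to spot, the JSON output can be reduced to a few counts. This is a rough sketch under stated assumptions: the real schema of the jobs output is not documented here, so the helpers only count quoted status strings, and the literal `PENDING` is a guess at how pending jobs are labeled. The function names `count_job_status` and `summarize_jobs` are hypothetical.

```shell
#!/bin/sh
# Count how often a status string appears in the jobs JSON read from stdin.
# Assumption: statuses appear as quoted JSON strings such as "FAILED".
count_job_status() {
  grep -o -F "\"$1\"" | wc -l | tr -d '[:space:]'
}

# Print one line per status of interest: "<STATUS> <count>".
# FAILED and RETRYING come from this checklist; PENDING is an assumed label.
summarize_jobs() {
  jobs_json=$(cat)
  for state in FAILED RETRYING PENDING; do
    n=$(printf '%s\n' "$jobs_json" | count_job_status "$state")
    echo "$state $n"
  done
}
```

A possible invocation is `dodil k3 ingest jobs --bucket "$K3_BUCKET" -o json | summarize_jobs`. Running it at intervals and comparing the counts gives a crude view of whether failures repeat or the pending queue grows.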
4. Search Quality Checks
dodil k3 search "health probe query" --bucket "$K3_BUCKET" --top-k 5
If no results:
- Verify collection exists.
- Verify ingest jobs completed.
- Lower --min-score and retest.
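The last check can be scripted as a small sweep over thresholds. This sketch assumes that `--min-score` takes a decimal value between 0 and 1, which this checklist does not confirm; the thresholds 0.7, 0.5, and 0.3 and the function name `sweep_min_score` are illustrative only.

```shell
#!/bin/sh
# Re-run the same query at progressively lower score thresholds.
# Assumption: --min-score accepts a decimal between 0 and 1.
sweep_min_score() {
  query=$1
  for score in 0.7 0.5 0.3; do
    # Print a separator so the results for each threshold are easy to tell apart.
    echo "== --min-score $score =="
    dodil k3 search "$query" --bucket "$K3_BUCKET" --top-k 5 --min-score "$score"
  done
}
```

For example, `sweep_min_score "health probe query"` runs the probe query three times. If results appear only at the lowest threshold, the ingest may be fine and the scoring cutoff is the likelier cause.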
5. Table Drain and Maintenance Checks
dodil k3 table describe events --bucket "$K3_BUCKET"
dodil k3 table compact events --bucket "$K3_BUCKET"
Focus on:
- WAL backlog and drain lag in describe output
- post-compaction stability of query results
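One way to check post-compaction stability is to run the same query before and after compaction and compare the output. The sketch below is an assumption-heavy illustration: `check_compaction_stable` is a hypothetical helper, and the query command is supplied by the caller because this checklist does not show a table query command.

```shell
#!/bin/sh
# Run a caller-supplied query before and after compaction and compare output.
# Usage: check_compaction_stable <table> <query command and arguments...>
# Exit status: 0 = identical output, 1 = output changed, 2 = a command failed.
check_compaction_stable() {
  table=$1
  shift
  # Capture the query output before compaction.
  before=$("$@") || return 2
  # Compaction command as given in this checklist.
  dodil k3 table compact "$table" --bucket "$K3_BUCKET" >/dev/null || return 2
  # Capture the same query again after compaction.
  after=$("$@") || return 2
  if [ "$before" = "$after" ]; then
    echo "stable"
  else
    echo "changed"
    return 1
  fi
}
```

The comparison is only meaningful when no writes land between the two query runs, since new data would change the results for reasons unrelated to compaction.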
6. Frequent Failure Patterns
- Auth errors: token expired or wrong org header.
- Object upload failure: missing https:// in API endpoint.
- Ingest idle: discovery not run, or source/rule mismatch.
- Table query errors: engine not enabled in bucket.
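The missing-scheme pattern is cheap to catch before an upload is attempted. A minimal sketch follows; `has_https_scheme` is a hypothetical helper, and `K3_ENDPOINT` in the usage note is an assumed variable name, not one defined by the CLI.

```shell
#!/bin/sh
# Return 0 if the endpoint starts with https://, 1 otherwise.
has_https_scheme() {
  case "$1" in
    https://*)
      return 0
      ;;
    *)
      # Report the problem on stderr so stdout stays clean for pipelines.
      echo "endpoint '$1' is missing the https:// scheme" >&2
      return 1
      ;;
  esac
}
```

A preflight line such as `has_https_scheme "$K3_ENDPOINT" || exit 1` would stop a script early instead of failing partway through an upload.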