IngestService API
Package: dodil.k3.ingest.v1
IngestService owns ingestion rules, source discovery/ingest triggers, and ingest job status.
What It Is For
- Define source matching rules and map them to pipelines.
- Trigger discovery scans and ingestion runs.
- Monitor ingest jobs and retry failed or partial objects.
Endpoint Map
Rules
| gRPC method | HTTP route |
|---|---|
| CreateRule | POST /:bucket/rules |
| ListRules | GET /:bucket/rules |
| GetRule | GET /:bucket/rules/:rule_id |
| UpdateRule | PATCH /:bucket/rules/:rule_id |
| DeleteRule | DELETE /:bucket/rules/:rule_id |
Source Discovery/Sync
| gRPC method | HTTP route |
|---|---|
| TriggerDiscovery | POST /:bucket/sources/:source_id/discover |
| TriggerIngestion | POST /:bucket/sources/:source_id/ingest |
| GetSyncStatus | GET /:bucket/sources/:source_id/sync |
Ingest Jobs
| gRPC method | HTTP route |
|---|---|
| TriggerIngest | POST /:bucket/ingest |
| ListIngestJobs | GET /:bucket/ingest/jobs |
| GetIngestStatus | GET /:bucket/ingest/jobs/:job_id |
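The endpoint map above can be turned into a small lookup so client code refers to gRPC method names instead of hand-built paths. The sketch below copies the verbs and route templates from the tables; the `ROUTES` dictionary, the `resolve_route` helper, and the `rule_789` identifier are hypothetical illustrations, not part of the service.

```python
from urllib.parse import quote

# Verbs and templates copied from the endpoint tables on this page.
ROUTES = {
    "CreateRule": ("POST", "/:bucket/rules"),
    "ListRules": ("GET", "/:bucket/rules"),
    "GetRule": ("GET", "/:bucket/rules/:rule_id"),
    "UpdateRule": ("PATCH", "/:bucket/rules/:rule_id"),
    "DeleteRule": ("DELETE", "/:bucket/rules/:rule_id"),
    "TriggerDiscovery": ("POST", "/:bucket/sources/:source_id/discover"),
    "TriggerIngestion": ("POST", "/:bucket/sources/:source_id/ingest"),
    "GetSyncStatus": ("GET", "/:bucket/sources/:source_id/sync"),
    "TriggerIngest": ("POST", "/:bucket/ingest"),
    "ListIngestJobs": ("GET", "/:bucket/ingest/jobs"),
    "GetIngestStatus": ("GET", "/:bucket/ingest/jobs/:job_id"),
}


def resolve_route(method_name: str, **params: str) -> tuple[str, str]:
    """Return (HTTP verb, concrete path) for a gRPC method name.

    Hypothetical client-side helper: it only fills ':name' placeholders.
    """
    verb, template = ROUTES[method_name]
    segments = []
    for segment in template.split("/"):
        if segment.startswith(":"):
            name = segment[1:]
            if name not in params:
                raise ValueError(f"{method_name} requires path parameter {name!r}")
            # Percent-encode so an identifier cannot break out of its segment.
            segment = quote(params[name], safe="")
        segments.append(segment)
    return verb, "/".join(segments)


# 'rule_789' is a made-up identifier used only for illustration.
print(resolve_route("GetRule", bucket="kb-prod", rule_id="rule_789"))
# prints ('GET', '/kb-prod/rules/rule_789')
```

Keeping the table in one place means a route change only needs to be made once in client code.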
Key Arguments
Create rule
| Field | Type | Required | Purpose |
|---|---|---|---|
| bucket | string | yes | Bucket scope |
| source_id | string | yes | Source to match against |
| pipeline_id | string | yes | Destination pipeline (canonical) |
| name | string | yes | Rule name |
| include_patterns | string[] | no | Glob include list |
| exclude_patterns | string[] | no | Glob exclude list |
| include_mime_types | string[] | no | MIME allow-list |
| exclude_mime_types | string[] | no | MIME deny-list |
| min_size_bytes | int64 | no | Minimum object size |
| max_size_bytes | int64 | no | Maximum object size (0 = no limit) |
| enabled | bool | no | Enable rule |
| priority | int32 | no | Higher value matches first |
Important:
- Route decisions should use `pipeline_id`. `binding` in responses is derived display metadata, not routing authority.
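To make the interaction of patterns, MIME lists, size limits, and priority concrete, here is a rough local simulation of rule selection. It is only a sketch under stated assumptions: the service's real glob dialect and tie-breaking are not documented here, so this uses Python's `fnmatch` (where `*` also crosses `/`), treats an empty include list as "match everything", and lets excludes win over includes. The `Rule` class, `matches`, `select_rule`, and the `pipe_001` identifier are hypothetical.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class Rule:
    """Local mirror of the create-rule fields (hypothetical helper type)."""
    name: str
    pipeline_id: str
    include_patterns: list[str] = field(default_factory=list)
    exclude_patterns: list[str] = field(default_factory=list)
    include_mime_types: list[str] = field(default_factory=list)
    exclude_mime_types: list[str] = field(default_factory=list)
    min_size_bytes: int = 0
    max_size_bytes: int = 0  # 0 = no limit, as stated in the field table
    enabled: bool = True
    priority: int = 0


def matches(rule: Rule, key: str, mime: str, size: int) -> bool:
    if not rule.enabled:
        return False
    # ASSUMPTION: empty include lists match everything; excludes win.
    if rule.include_patterns and not any(fnmatchcase(key, p) for p in rule.include_patterns):
        return False
    if any(fnmatchcase(key, p) for p in rule.exclude_patterns):
        return False
    if rule.include_mime_types and mime not in rule.include_mime_types:
        return False
    if mime in rule.exclude_mime_types:
        return False
    if size < rule.min_size_bytes:
        return False
    if rule.max_size_bytes and size > rule.max_size_bytes:
        return False
    return True


def select_rule(rules: list[Rule], key: str, mime: str, size: int) -> Optional[Rule]:
    # Higher priority is tried first (from the field table).
    # ASSUMPTION: ties keep list order; the source does not specify this.
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if matches(rule, key, mime, size):
            return rule
    return None


rules = [
    Rule(name="all-contracts", pipeline_id="pipe_001",
         include_patterns=["contracts/*"], priority=10),
    Rule(name="pdf-contracts", pipeline_id="pipe_456",
         include_patterns=["contracts/**/*.pdf"], exclude_patterns=["*/drafts/*"],
         max_size_bytes=50_000_000, priority=100),
]

chosen = select_rule(rules, "contracts/2026/acme.pdf", "application/pdf", 1_200_000)
print(chosen.pipeline_id)  # prints pipe_456
```

In line with the note above, the simulation reads the destination from `pipeline_id` and never from a display field such as `binding`.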
Trigger discovery
| Field | Type | Required | Purpose |
|---|---|---|---|
| bucket | string | yes | Bucket scope |
| source_id | string | yes | Source to scan |
| full_sync | bool | no | Ignore checkpoint and rescan |
| rule_id | string | no | Scope discovery to one rule |
Trigger ingestion from source queue
| Field | Type | Required | Purpose |
|---|---|---|---|
| bucket | string | yes | Bucket scope |
| source_id | string | yes | Source to ingest |
| rule_id | string | no | Restrict to one rule |
| source_object_id | string | no | Restrict to one source object |
| retry_failed | bool | no | Include failed/partial objects |
Trigger one object ingest
| Field | Type | Required | Purpose |
|---|---|---|---|
| object.bucket | string | yes | Object bucket |
| object.key | string | yes | Object key |
| pipeline_id | string | no | Explicit pipeline override |
| rule_id | string | no | Explicit rule override |
| options | map<string,string> | no | Per-event option overlay |
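The single-object fields above could be assembled into a request along the following lines. This is a sketch, not an official client: the host and header names are taken from the curl examples on this page, while `build_single_ingest_request` and the `"ocr"` option key are hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Mapping, Optional

BASE_URL = "https://k3.dev.dodil.io"  # host used in this page's examples


def build_single_ingest_request(
    token: str,
    org_id: str,
    bucket: str,
    key: str,
    pipeline_id: Optional[str] = None,
    rule_id: Optional[str] = None,
    options: Optional[Mapping[str, str]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /:bucket/ingest request."""
    body: dict = {"object": {"bucket": bucket, "key": key}}
    # Optional fields are omitted entirely when unset.
    if pipeline_id:
        body["pipeline_id"] = pipeline_id
    if rule_id:
        body["rule_id"] = rule_id
    if options:
        # options is map<string,string>, so reject non-string entries early.
        for name, value in options.items():
            if not isinstance(name, str) or not isinstance(value, str):
                raise TypeError("options keys and values must be strings")
        body["options"] = dict(options)
    return urllib.request.Request(
        url=f"{BASE_URL}/{bucket}/ingest",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "x-organization-id": org_id,
            "Content-Type": "application/json",
        },
    )


request = build_single_ingest_request(
    token="<token>",
    org_id="<org>",
    bucket="kb-prod",
    key="contracts/acme-2026.pdf",
    pipeline_id="pipe_456",
    options={"ocr": "true"},  # hypothetical option key, for illustration only
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. The response shape is not described on this page, so the sketch stops at request construction.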
Examples
Create an ingest rule
curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/rules" \
-H "Authorization: Bearer $K3_TOKEN" \
-H "x-organization-id: $K3_ORG" \
-H "Content-Type: application/json" \
-d '{
"bucket": "kb-prod",
"source_id": "src_123",
"pipeline_id": "pipe_456",
"name": "pdf-contracts",
"include_patterns": ["contracts/**/*.pdf"],
"enabled": true,
"priority": 100
}'
Trigger full discovery
curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/sources/src_123/discover" \
-H "Authorization: Bearer $K3_TOKEN" \
-H "x-organization-id: $K3_ORG" \
-H "Content-Type: application/json" \
-d '{"bucket":"kb-prod","source_id":"src_123","full_sync":true}'
Trigger ingestion and retry failed
curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/sources/src_123/ingest" \
-H "Authorization: Bearer $K3_TOKEN" \
-H "x-organization-id: $K3_ORG" \
-H "Content-Type: application/json" \
-d '{
"bucket": "kb-prod",
"source_id": "src_123",
"retry_failed": true
}'
Trigger a one-shot ingest for one object
curl -sS -X POST "https://k3.dev.dodil.io/kb-prod/ingest" \
-H "Authorization: Bearer $K3_TOKEN" \
-H "x-organization-id: $K3_ORG" \
-H "Content-Type: application/json" \
-d '{
"object": {"bucket": "kb-prod", "key": "contracts/acme-2026.pdf"},
"pipeline_id": "pipe_456"
}'
Common Use Cases
- Rule-based document routing by path, MIME, and size constraints.
- Controlled replay of failed documents after pipeline fixes.
- Single-document reprocess workflows for rapid QA.
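For the replay and reprocess workflows above, a caller typically triggers the work and then polls `GET /:bucket/ingest/jobs/:job_id` until the job settles. The sketch below shows one way to structure that loop with exponential backoff. It is deliberately transport-agnostic: `fetch_status` is a caller-supplied function, and the state names in `TERMINAL_STATES` are hypothetical placeholders, since this page does not list the job states the service returns.

```python
import time
from typing import Callable, Collection

# ASSUMPTION: placeholder state names; substitute the values the service returns.
TERMINAL_STATES = frozenset({"succeeded", "failed", "partial"})


def wait_for_job(
    fetch_status: Callable[[], str],
    terminal_states: Collection[str] = TERMINAL_STATES,
    interval_s: float = 2.0,
    max_interval_s: float = 30.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until fetch_status() returns a terminal state, then return it.

    fetch_status is expected to call GetIngestStatus and extract the job
    state; how that state is encoded in the response is not assumed here.
    """
    deadline = time.monotonic() + timeout_s
    delay = interval_s
    while True:
        status = fetch_status()
        if status in terminal_states:
            return status
        # Give up if the next wait would run past the deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s}s")
        sleep(delay)
        # Double the wait each round, capped at max_interval_s.
        delay = min(delay * 2, max_interval_s)


# Usage with a stubbed status source instead of a real HTTP call.
fake_states = iter(["queued", "running", "succeeded"])
print(wait_for_job(lambda: next(fake_states), sleep=lambda _s: None))
# prints succeeded
```

Injecting `sleep` keeps the loop easy to test and lets callers swap in their own scheduling.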