Workflow: Source to Ingest Pipeline Flow
This workflow builds a complete ingestion path: bucket -> source -> pipeline -> rule -> discovery -> ingestion jobs.
When To Use
- First-time data onboarding into K3.
- Rebuilding an ingest route in a new environment.
- Validating that source and pipeline wiring is correct.
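The commands in the steps below reference $K3_BUCKET and, for the API call, the bash arrays AUTH and JSON, none of which are defined in this workflow. A minimal setup sketch follows; the placeholder bucket name, the K3_TOKEN variable, and the bearer-token header scheme are all assumptions rather than documented K3 behavior, so adjust them to whatever your environment actually uses.

```shell
# Bucket name used by every step; "docs-bucket" is only a placeholder.
K3_BUCKET="${K3_BUCKET:-docs-bucket}"

# Extra curl arguments for the API call in Step 2 (bash arrays).
# ASSUMPTION: the API accepts a bearer token; K3_TOKEN is a hypothetical variable.
AUTH=(-H "Authorization: Bearer ${K3_TOKEN:-}")
JSON=(-H "Content-Type: application/json")
```

Keeping the headers in arrays lets "${AUTH[@]}" and "${JSON[@]}" expand to separate, correctly quoted curl arguments.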
Step 1: Create bucket
dodil k3 \
bucket create "$K3_BUCKET" --description "Workflow bucket"
Step 2: Create source
CLI minimum:
dodil k3 \
source create docs-source --bucket "$K3_BUCKET"
If you need to set the provider, root path, and sync schedule in one request, use the API:
curl -sS -X POST "https://k3.dev.dodil.io/$K3_BUCKET/sources" "${AUTH[@]}" "${JSON[@]}" \
-d '{
  "bucket": "'"$K3_BUCKET"'",
  "provider": "SOURCE_PROVIDER_INTERNAL_S3",
  "name": "docs-source",
  "root_path": "incoming/",
  "sync_interval_seconds": 3600,
  "enabled": true
}'
Capture source_id from the output.
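One way to capture an ID from JSON output in a script is a small field extractor. The sketch below assumes the response carries the ID as a top-level string field literally named source_id, which this workflow does not confirm. json_field is a hypothetical helper, not part of the dodil CLI, and a real JSON parser is the safer choice when one is available.

```shell
# Print the first string value of the given key from JSON on stdin.
# Deliberately simple: handles plain "key": "value" pairs only
# (no escaped quotes, no nested lookups).
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

Piping the Step 2 output through it, for example SOURCE_ID=$(... | json_field source_id), would store the value for later steps, provided the field name matches what the API really returns.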
Step 3: Create pipeline
dodil k3 \
pipeline create embed-docs --bucket "$K3_BUCKET" --scriptum object_embedding_index -o json
Capture pipeline_id from the output.
Step 4: Create ingest rule
dodil k3 \
ingest add docs-rule \
--bucket "$K3_BUCKET" \
--source <source_id> \
--collection <pipeline_id> \
--include "incoming/**/*.pdf"
Note: --collection maps to the API field pipeline_id.
Step 5: Trigger discovery then ingestion
dodil k3 \
ingest trigger-discovery --bucket "$K3_BUCKET" --source <source_id> --full-sync
dodil k3 \
ingest trigger --bucket "$K3_BUCKET" --source <source_id>
Step 6: Observe jobs
dodil k3 \
ingest jobs --bucket "$K3_BUCKET" -o json
Expected:
- Jobs move through PENDING -> PROCESSING -> COMPLETED.
- If transient failures happen, the status can show RETRYING.
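To wait for a run to finish instead of re-running the jobs command by hand, a polling loop can check for in-flight statuses. This is a sketch only: it assumes the JSON output contains the status names above as quoted strings, which is a guess about the output shape, and jobs_settled and wait_for_jobs are hypothetical helper names.

```shell
# Succeed when the JSON on stdin mentions no in-flight status.
# ASSUMPTION: statuses appear as quoted strings such as "PENDING".
jobs_settled() {
  ! grep -Eq '"(PENDING|PROCESSING|RETRYING)"'
}

# Poll the jobs listing until nothing is in flight.
# Usage: wait_for_jobs [interval_seconds]
wait_for_jobs() {
  interval="${1:-10}"
  while :; do
    # Stop with an error if the CLI call itself fails.
    out=$(dodil k3 ingest jobs --bucket "$K3_BUCKET" -o json) || return 1
    if printf '%s\n' "$out" | jobs_settled; then
      return 0
    fi
    sleep "$interval"
  done
}
```

A settled listing only means nothing is still running; inspect the final output to confirm that jobs actually reached COMPLETED.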
Common Troubleshooting
- No jobs dispatched: verify source object availability and rule include patterns.
- Jobs fail immediately: verify pipeline/template exists and is accessible.
- Reprocessing needed: run ingest trigger --retry-failed.
Next: Vector Search Lifecycle