Workflow: Source to Ingest Pipeline Flow

This workflow builds a complete ingestion path: bucket -> source -> pipeline -> rule -> discovery -> ingestion jobs.

When To Use

First-time data onboarding into K3.
Rebuilding an ingest route in a new environment.
Validating that source and pipeline wiring is correct.

Step 1: Create bucket


dodil k3 \
  bucket create "$K3_BUCKET" --description "Workflow bucket"

Step 2: Create source

Minimal CLI form:


dodil k3 \
  source create docs-source --bucket "$K3_BUCKET"

If you need to set the provider, root path, and sync schedule in one request, use the API:


curl -sS -X POST "https://k3.dev.dodil.io/$K3_BUCKET/sources" "${AUTH[@]}" "${JSON[@]}" \
  -d '{
    "bucket": "'"$K3_BUCKET"'",
    "provider": "SOURCE_PROVIDER_INTERNAL_S3",
    "name": "docs-source",
    "root_path": "incoming/",
    "sync_interval_seconds": 3600,
    "enabled": true
  }'
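The curl call above expands two Bash arrays, AUTH and JSON, that this page does not define. The following is a minimal sketch of how they might be set up, assuming bearer-token authentication and a JSON request body. The K3_TOKEN variable name and the Authorization header scheme are assumptions for illustration, not confirmed K3 behavior.

```shell
# Assumption: K3 accepts a bearer token; K3_TOKEN is a hypothetical variable name.
K3_TOKEN="${K3_TOKEN:-example-token}"

# Each array holds complete curl arguments, so "${AUTH[@]}" expands to
# separate, correctly quoted words even when a value contains spaces.
AUTH=(-H "Authorization: Bearer $K3_TOKEN")
JSON=(-H "Content-Type: application/json")

# Print the expanded arguments one per line to check the quoting.
printf '%s\n' "${AUTH[@]}" "${JSON[@]}"
```

Keeping the headers in arrays rather than plain strings avoids word-splitting problems when a header value contains spaces.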

Capture source_id from the output.
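One way to capture an ID into a shell variable is sketched below. It assumes the response is a flat JSON object with a top-level string field named source_id; the sample response and the json_field helper are hypothetical, and the real output shape may differ. The same approach would apply to pipeline_id in Step 3.

```shell
# Extract a top-level string field from a flat JSON object read on stdin.
# Minimal sketch only: a real JSON parser such as jq is more robust.
json_field() {
  # $1 = field name
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical response; replace with the actual command output.
sample='{"source_id": "src-123", "name": "docs-source"}'

SOURCE_ID=$(printf '%s\n' "$sample" | json_field source_id)
echo "$SOURCE_ID"
```

With the ID in a variable, later steps can pass --source "$SOURCE_ID" instead of a pasted value.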

Step 3: Create pipeline


dodil k3 \
  pipeline create embed-docs --bucket "$K3_BUCKET" --scriptum object_embedding_index -o json

Capture pipeline_id from the JSON output.

Step 4: Create ingest rule


dodil k3 \
  ingest add docs-rule \
  --bucket "$K3_BUCKET" \
  --source <source_id> \
  --collection <pipeline_id> \
  --include "incoming/**/*.pdf"

Note: --collection takes the pipeline_id captured in Step 3; the flag maps to the API field pipeline_id.

Step 5: Trigger discovery then ingestion


dodil k3 \
  ingest trigger-discovery --bucket "$K3_BUCKET" --source <source_id> --full-sync
 
dodil k3 \
  ingest trigger --bucket "$K3_BUCKET" --source <source_id>

Step 6: Observe jobs


dodil k3 \
  ingest jobs --bucket "$K3_BUCKET" -o json

Expected:

Jobs move through PENDING -> PROCESSING -> COMPLETED.
If a transient failure occurs, a job's status can show RETRYING.
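The status progression above can be watched with a small polling loop. This is a sketch under stated assumptions: it expects a caller-supplied function that prints one job status per line, because the JSON field that carries the status is not documented on this page. The names jobs_settled, wait_for_jobs, and demo_statuses are hypothetical helpers.

```shell
# Return 0 when no status on stdin is still in flight.
# Any status other than PENDING, PROCESSING, or RETRYING counts as settled.
jobs_settled() {
  ! grep -Eq '^(PENDING|PROCESSING|RETRYING)$'
}

# Poll until all jobs settle or the attempts run out.
# $1 = function printing one status per line, $2 = attempts, $3 = delay in seconds
wait_for_jobs() {
  local lister=$1 attempts=${2:-30} delay=${3:-10} i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$lister" | jobs_settled; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Stand-in for a real status lister, used here only to exercise the loop.
demo_statuses() { printf '%s\n' COMPLETED COMPLETED; }

wait_for_jobs demo_statuses 3 0 && echo "all jobs settled"
```

To use it for real, replace demo_statuses with a function that runs the Step 6 command and prints each job's status on its own line. An empty job list also counts as settled, so confirm that jobs were dispatched before relying on the result.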

Common Troubleshooting

No jobs dispatched: verify source object availability and rule include patterns.
Jobs fail immediately: verify that the pipeline and its template exist and are accessible.
Reprocessing needed: run dodil k3 ingest trigger with --retry-failed.

Next: Vector Search Lifecycle