
Workflow: Source to Ingest Pipeline Flow

This workflow builds a complete ingestion path: bucket -> source -> pipeline -> rule -> discovery -> ingestion jobs.

When To Use

  • First-time data onboarding into K3.
  • Rebuilding an ingest route in a new environment.
  • Validating that source and pipeline wiring is correct.

Step 1: Create bucket

dodil k3 bucket create "$K3_BUCKET" --description "Workflow bucket"

Step 2: Create source

CLI minimum:

dodil k3 source create docs-source --bucket "$K3_BUCKET"

If you need to set the provider, root path, and sync schedule in one request, use the API:

curl -sS -X POST "https://k3.dev.dodil.io/$K3_BUCKET/sources" "${AUTH[@]}" "${JSON[@]}" \
  -d '{
    "bucket": "'"$K3_BUCKET"'",
    "provider": "SOURCE_PROVIDER_INTERNAL_S3",
    "name": "docs-source",
    "root_path": "incoming/",
    "sync_interval_seconds": 3600,
    "enabled": true
  }'

Capture the source_id from the output; Step 4 needs it.
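
One way to capture the ID is to pull the field out of the JSON response in the shell. The sketch below assumes the response is a JSON object with a string field named source_id (the exact response shape is not shown on this page). json_field is a hypothetical helper, and the sample response is made up for illustration; where jq is installed, it would be the more robust choice.

```shell
# Hypothetical helper: print the value of a named string field from JSON
# read on stdin. This is a sed-based sketch, not a real JSON parser: it
# does not handle escaped quotes or distinguish nested objects.
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example with a made-up response body (field names are assumptions).
response='{"source_id": "src-123", "name": "docs-source"}'
SOURCE_ID="$(printf '%s\n' "$response" | json_field source_id)"
echo "source_id=$SOURCE_ID"
```

The same helper could be reused for pipeline_id in Step 3 by piping the -o json output into json_field pipeline_id, assuming that output exposes the field under that name.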

Step 3: Create pipeline

dodil k3 pipeline create embed-docs --bucket "$K3_BUCKET" --scriptum object_embedding_index -o json

Capture the pipeline_id from the JSON output.

Step 4: Create ingest rule

dodil k3 ingest add docs-rule \
  --bucket "$K3_BUCKET" \
  --source <source_id> \
  --collection <pipeline_id> \
  --include "incoming/**/*.pdf"

Note: the CLI flag --collection maps to the API field pipeline_id, so pass the pipeline ID captured in Step 3.
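
The <source_id> and <pipeline_id> placeholders must be replaced with real values before the command runs. If the IDs live in shell variables, an empty variable would silently produce a malformed command, so a small guard can help. The wrapper below is a sketch: add_docs_rule is a hypothetical helper name, while the dodil flags are the ones shown in this step.

```shell
# Hypothetical wrapper around the Step 4 command. It refuses to run when
# either ID is empty, which usually means a capture step failed earlier.
add_docs_rule() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: add_docs_rule <source_id> <pipeline_id>" >&2
    return 2
  fi
  dodil k3 ingest add docs-rule \
    --bucket "$K3_BUCKET" \
    --source "$1" \
    --collection "$2" \
    --include "incoming/**/*.pdf"
}

# Usage (assuming SOURCE_ID and PIPELINE_ID were captured in Steps 2 and 3):
#   add_docs_rule "$SOURCE_ID" "$PIPELINE_ID"
```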

Step 5: Trigger discovery then ingestion

dodil k3 ingest trigger-discovery --bucket "$K3_BUCKET" --source <source_id> --full-sync
dodil k3 ingest trigger --bucket "$K3_BUCKET" --source <source_id>

Step 6: Observe jobs

dodil k3 ingest jobs --bucket "$K3_BUCKET" -o json

Expected:

  • jobs move through PENDING -> PROCESSING -> COMPLETED
  • if transient failures happen, status can show RETRYING
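
Rather than rerunning the jobs command by hand, the status progression above can be polled in a loop. This is a minimal sketch under one assumption: that the JSON output contains the state names PENDING, PROCESSING, and RETRYING as quoted strings while a job is still in flight. jobs_settled and wait_for_jobs are hypothetical helper names, and the attempt count and delay are arbitrary defaults.

```shell
# Succeeds when the JSON on stdin contains no in-flight state.
# Note: an empty job list also counts as settled.
jobs_settled() {
  ! grep -Eq '"(PENDING|PROCESSING|RETRYING)"'
}

# Poll the jobs listing until nothing is in flight.
# $1 = maximum attempts (default 30), $2 = seconds between attempts (default 10).
# Returns 0 once settled, 1 if the attempts run out.
wait_for_jobs() {
  attempts="${1:-30}"
  delay="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Only trust the check when the CLI call itself succeeded.
    if out="$(dodil k3 ingest jobs --bucket "$K3_BUCKET" -o json)" &&
      printf '%s\n' "$out" | jobs_settled; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait_for_jobs 60 5 || echo "jobs still running after 5 minutes" >&2
```

Settled here only means that no job is waiting or running; it does not imply that every job reached COMPLETED, so the final listing is still worth inspecting.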

Common Troubleshooting

  1. No jobs dispatched: verify source object availability and rule include patterns.
  2. Jobs fail immediately: verify pipeline/template exists and is accessible.
  3. Reprocessing needed: run ingest trigger --retry-failed.
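
To decide which of the cases above applies, it can help to tally jobs by state. The sketch below assumes each job in the JSON output carries a field named status whose value is an upper-case state name. That field name is an assumption, and status_counts is a hypothetical helper.

```shell
# Print one "STATE=count" line per distinct status value found in the
# JSON on stdin. Prints nothing when no status fields are present.
status_counts() {
  grep -Eo '"status"[[:space:]]*:[[:space:]]*"[A-Z_]+"' \
    | sed 's/.*"\([A-Z_]*\)"$/\1/' \
    | sort | uniq -c \
    | awk '{ print $2 "=" $1 }'
}

# Usage: dodil k3 ingest jobs --bucket "$K3_BUCKET" -o json | status_counts
```

An empty tally would point at the first case (nothing dispatched), while a tally with no COMPLETED entries shortly after triggering would point at the second.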
