Documentation

Learn OptimaFlo

Guides and tutorials to help you go from raw data to clean dashboards, whether you're connecting your first source or deploying in your own cloud.

Browse by Topic

Everything from first setup to production deployment.

Getting Started
Create your first pipeline in under five minutes. Covers account setup, workspace creation, and core concepts.
  • Create your first pipeline
  • Workspace & project setup
  • Core concepts: Ingestion, Cleaning, Aggregation
  • Connecting your first data source
Platform Architecture
Understand the eight-layer data architecture, how data flows through its layers, and how DuckDB, BigQuery, and Spark auto-scale under the hood.
  • Layered Data Architecture overview
  • Engine selection: DuckDB vs BigQuery vs Spark
  • Apache Iceberg & Polaris catalog
  • Data lifecycle & time-travel
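The engine-selection idea above can be pictured as a size-based heuristic: small workloads stay in-process, larger ones move to a warehouse or cluster. This is a minimal illustrative sketch only; the thresholds and the `needs_cluster` flag are assumptions, not OptimaFlo's actual selection logic:

```python
def select_engine(size_gb: float, needs_cluster: bool = False) -> str:
    """Pick a query engine by workload size (illustrative thresholds only)."""
    if needs_cluster:
        return "spark"      # distributed transforms regardless of size
    if size_gb < 10:
        return "duckdb"     # small data: in-process, lowest latency
    if size_gb < 1000:
        return "bigquery"   # mid-size: serverless warehouse scan
    return "spark"          # large batch jobs

print(select_engine(0.5))   # -> duckdb
print(select_engine(250))   # -> bigquery
print(select_engine(5000))  # -> spark
```

The point of the sketch is the shape of the decision, not the numbers: the platform docs describe this choice as automatic, so users never pick an engine by hand.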
Connectors & Sources
Connect to GCS, BigQuery, REST APIs, and more. Schema inference and one-click OAuth included.
  • GCS file connector
  • REST API connector
  • BigQuery connector
  • Adding new data sources
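Schema inference for a REST or file source amounts to scanning sample records and widening column types on conflict. A minimal sketch of that idea, with assumed type names and widening rules (not the platform's actual inference engine):

```python
from typing import Any

def infer_schema(records: list[dict[str, Any]]) -> dict[str, str]:
    """Infer a column -> type map from sample records, widening on conflict."""
    rank = {"int": 0, "float": 1, "str": 2}  # widening order: int < float < str
    schema: dict[str, str] = {}
    for row in records:
        for col, val in row.items():
            t = {int: "int", float: "float"}.get(type(val), "str")
            # keep the widest type seen so far for this column
            if col not in schema or rank[t] > rank[schema[col]]:
                schema[col] = t
    return schema

sample = [{"id": 1, "price": 9.5}, {"id": 2, "price": 10, "sku": "A-1"}]
print(infer_schema(sample))  # {'id': 'int', 'price': 'float', 'sku': 'str'}
```

Note how `price` stays `float` even though the second record holds an integer, and `sku` appears only once but is still picked up: sampling more records narrows surprises, never the schema.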
Pipeline Builder
Drag-and-drop pipeline canvas, SQL node configuration, and scheduling with Airflow.
  • Visual canvas walkthrough
  • Node types & configuration
  • SQL generation & copilot
  • Scheduling & backfills
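Scheduling and backfills boil down to enumerating the run dates a pipeline has missed and executing one run per date. A hedged sketch of that idea in plain Python (not the Airflow API itself):

```python
from datetime import date, timedelta

def backfill_dates(start: date, end: date) -> list[date]:
    """List every daily run date from start to end, inclusive."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days + 1)]

runs = backfill_dates(date(2026, 1, 1), date(2026, 1, 4))
print([d.isoformat() for d in runs])
# ['2026-01-01', '2026-01-02', '2026-01-03', '2026-01-04']
```

In Airflow terms, this is what catching up on a daily schedule does: one logical run per missed interval between the start date and now.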
AI Agents
Eight purpose-built AI agents: Concierge, SQL Copilot, Analyst, Pipeline Generator, Dashboard Generation, Dashboard Copilot, Data Source AI, and Data Quality AI.
  • Concierge AI: end-to-end orchestration
  • SQL Copilot & Pipeline Generator
  • Analyst AI, Dashboard Generation & Copilot
  • Data Source AI & Data Quality AI
BYOC Deployment
Deploy OptimaFlo in your own cloud with Terraform-managed infrastructure and full data residency: your data never leaves your environment.
  • GCP setup guide
  • Required APIs & IAM roles
  • Terraform infrastructure
  • Architecture & security model
BI & Dashboards
Semantic layer, chart types, KPI tiles, dashboard creation, and embedding for stakeholder access.
  • Creating dashboards
  • Chart types & configuration
  • Semantic layer & metrics
  • Sharing & embedding
Data Quality
Quality scoring across five dimensions, schema enforcement, validation rules, and self-healing SQL throughout your pipeline.
  • Quality scores explained
  • Automated profiling
  • Validation rules
  • Self-healing SQL & schema enforcement
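A quality score over several dimensions can be modeled as a weighted average of per-dimension scores. The five dimension names and equal weights below are common industry choices assumed for illustration; the docs above don't specify OptimaFlo's actual dimensions or weighting:

```python
# Hypothetical dimensions and equal weights -- not OptimaFlo's actual model.
WEIGHTS = {
    "completeness": 0.2, "accuracy": 0.2, "consistency": 0.2,
    "timeliness": 0.2, "validity": 0.2,
}

def quality_score(dims: dict[str, float]) -> float:
    """Combine per-dimension scores (0-100) into one weighted overall score."""
    return sum(WEIGHTS[d] * dims[d] for d in WEIGHTS)

score = quality_score({
    "completeness": 98, "accuracy": 90, "consistency": 85,
    "timeliness": 70, "validity": 95,
})
print(round(score, 1))  # 87.6
```

Whatever the real dimensions are, the value of a single rolled-up score is the same: one number to alert on, with the per-dimension breakdown pointing at what to fix.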

Built on Open Standards

OptimaFlo is built on Apache Iceberg, Apache Airflow, and Apache Polaris. No proprietary formats, no data lock-in. Your data stays yours, and these docs show you exactly how it all fits together.

Ready to build?

Go from raw data to dashboards in minutes.

Now in early beta. Plans from $2,500/mo. Deployed in your cloud. Your data never leaves.

AI-native data platform. From raw data to business dashboards powered by Apache open standards, visual pipeline building, and AI agents that handle the heavy lifting.

© 2026 OptimaFlo. All rights reserved.
