Getting Started

Your First Pipeline in 5 Minutes

Connect a data source, build a medallion pipeline, and schedule it, all from the browser. No infrastructure to manage, no YAML to write.

Quick Start

1. Sign Up & Set Up: create your organization and workspace through the onboarding wizard
2. Connect a Source: link GCS, BigQuery, or a REST API to your workspace
3. Build a Pipeline: use the visual canvas or let Concierge AI generate one for you
4. Run & Schedule: execute your pipeline and set an Airflow schedule
Step 1: Sign Up & Set Up Your Workspace

The onboarding wizard walks you through creating your organization and workspace in four steps. Everything is collected up front and submitted together at the end.

1. Sign in at optimaflo.io/sign-in with Google, Microsoft, Amazon, or email.
2. Organization: name your organization, set a URL slug, and select your industry and team size.
3. Workspace: create one or more workspaces (up to 5 during onboarding). Separate by environment, team, or project.
4. BYOC: configure your Bring Your Own Cloud deployment on GCP. Select your infrastructure tier and region, then the platform provisions everything. AWS and Azure support is coming soon.

Each workspace gets its own Apache Iceberg catalog, so data from different workspaces is fully isolated — even within the same organization.

Step 2: Connect a Data Source

OptimaFlo currently supports Google Cloud Storage, BigQuery, and REST API connectors, with more on the way. The Data Source AI agent guides you through the entire process conversationally; tell it what you want to connect and it handles authentication, browsing, file selection, validation, and schema inference.

Available Today

  • Google Cloud Storage
  • BigQuery
  • REST API (any endpoint)

Coming Soon

  • Amazon S3
  • Redshift
  • Snowflake
  • PostgreSQL
  • MySQL
  • GraphQL
1. Open Data Sources in the sidebar and click Add Source; this opens the Data Source AI agent.
2. Tell the agent what you want to connect (e.g. "Connect my GCS bucket gs://company-data"). It authenticates via OAuth for cloud sources or asks for credentials for databases.
3. The agent browses your buckets, folders, or tables and lets you select specific files or datasets to ingest.
4. It validates the connection, infers your schema, and creates the data source record, ready for use in a pipeline.

For GCS and BigQuery, authentication happens via a Google OAuth popup — no service account keys to manage. The platform handles token refresh automatically.
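Conceptually, schema inference maps sample values to column types and widens a type when a column mixes them (e.g. integers and decimals). A minimal Python sketch of the idea; the function and type names here are assumptions for illustration, not OptimaFlo's actual API:

```python
def infer_schema(records):
    """Infer a simple column -> type mapping from sample records.

    Illustrative sketch only: the Data Source AI agent does this for
    you, and these type names are assumptions, not the platform's.
    """
    schema = {}
    for record in records:
        for column, value in record.items():
            # Check bool before int: in Python, bool is a subclass of int.
            if isinstance(value, bool):
                inferred = "boolean"
            elif isinstance(value, int):
                inferred = "long"
            elif isinstance(value, float):
                inferred = "double"
            else:
                inferred = "string"
            # Widen long -> double if both appear in the same column.
            if schema.get(column) == "long" and inferred == "double":
                schema[column] = "double"
            elif column not in schema:
                schema[column] = inferred
    return schema

sample = [{"order_id": 1, "amount": 19.99}, {"order_id": 2, "amount": 5}]
print(infer_schema(sample))  # {'order_id': 'long', 'amount': 'double'}
```

Note that `amount` is widened to `double` even though the second record holds an integer, so later values like `5` still fit the column.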

Step 3: Build Your Pipeline

Three ways to build: drag-and-drop on the visual canvas, use the Pipeline Generator AI to create a pipeline from a prompt, or let Concierge AI handle the entire workflow end-to-end.

Visual Canvas
Drag nodes onto the canvas and connect them. Configure SQL for each transform node. Preview results before saving.
  • Add an Ingestion node linked to your data source
  • Add Cleaning and Aggregation nodes for transformations
  • Connect them with edges to define data flow
  • Write SQL or use the SQL Copilot to generate it
Pipeline Generator AI
Describe your transformations and the generator builds the full canvas — Source, Ingestion, Cleaning, and Aggregation nodes with SQL already written.
  • Open the Generator panel in the canvas sidebar
  • Describe what you want to transform
  • Review the generated nodes and SQL
  • Apply to canvas with one click
Concierge AI
End-to-end orchestration. Concierge connects sources, generates SQL, builds the canvas, and deploys the pipeline — all from one conversation.

"Connect my GCS bucket gs://sales-data, clean the CSVs, deduplicate on order_id, and create a monthly revenue summary."

Step 4: Run and Schedule

Execute your pipeline manually or set a schedule. OptimaFlo generates an Apache Airflow DAG behind the scenes; you never touch Airflow config directly.

1. Click Execute to run the pipeline immediately. The platform auto-saves before executing.
2. Monitor progress in the execution panel and watch each layer complete, from Ingestion through Aggregation.
3. Open Settings to set a schedule (hourly, daily, weekly, monthly, quarterly, or yearly). The platform converts your selection to an Airflow cron expression.
4. Use Backfills to re-process historical date ranges when you change transformation logic.
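The schedule presets above translate to cron expressions under the hood. A hedged sketch of what that mapping could look like; the exact expressions OptimaFlo generates are not documented here:

```python
# Plausible preset -> cron mapping (assumption for illustration).
SCHEDULE_TO_CRON = {
    "hourly":    "0 * * * *",
    "daily":     "0 0 * * *",
    "weekly":    "0 0 * * 0",    # midnight every Sunday
    "monthly":   "0 0 1 * *",    # midnight on the 1st
    "quarterly": "0 0 1 */3 *",  # 1st of every third month
    "yearly":    "0 0 1 1 *",    # midnight on January 1st
}

def to_cron(preset: str) -> str:
    """Return the cron expression for a schedule preset."""
    return SCHEDULE_TO_CRON[preset]

print(to_cron("monthly"))  # 0 0 1 * *
```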

Backfills run sequentially by default to avoid Iceberg write conflicts. You can increase parallelism for independent tables.
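Running sequentially means the backfill walks the requested range one window at a time, so only one job writes to a given table at once. A sketch of that windowing in Python; the function name and window size are assumptions, not the platform's API:

```python
from datetime import date, timedelta

def backfill_windows(start: date, end: date, days: int = 1):
    """Yield sequential (window_start, window_end) ranges covering
    [start, end). Illustrative only: mirrors how a backfill processes
    one window at a time to avoid concurrent Iceberg writes."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + timedelta(days=days), end)
        yield cursor, window_end
        cursor = window_end

windows = list(backfill_windows(date(2026, 1, 1), date(2026, 1, 4)))
print(len(windows))  # 3 one-day windows, executed in order
```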

Core Concepts: Ingestion, Cleaning, Aggregation

The medallion architecture is the backbone of every OptimaFlo pipeline.

Ingestion

Raw Data

Your source data lands here untouched. Every record is preserved in Apache Iceberg tables with full history, ACID transactions, and time-travel.

  • Zero transformations — exact copy of the source
  • Schema detected automatically on ingestion
  • Full history retention for compliance and replay
  • Partitioned for query performance

Cleaning

Cleaned Data

Cleaned, deduplicated, and type-cast. SQL Copilot generates transformations from plain English, then you review and approve before anything runs.

  • LLM-generated SQL from natural language
  • Preview results before committing
  • Deduplication, null handling, and type casting
  • Validated and schema-enforced before execution

Aggregation

Business Metrics

Aggregated, business-ready metrics and star schemas. Feed dashboards, exports, and analyst queries from a single source of truth.

  • Aggregations and business KPIs
  • Star schema for analytics
  • Incremental updates to minimize compute
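Incremental updates merge only the new batch into the metrics table instead of rebuilding it from scratch. A sketch of that merge using SQLite's upsert syntax (the platform uses Iceberg merge semantics; the table name here is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_revenue (month TEXT PRIMARY KEY, revenue REAL)")
conn.execute("INSERT INTO monthly_revenue VALUES ('2026-01', 150.0)")

# Incremental update: merge a new batch rather than recomputing everything.
new_batch = [("2026-01", 25.0), ("2026-02", 75.0)]
conn.executemany(
    """
    INSERT INTO monthly_revenue (month, revenue) VALUES (?, ?)
    ON CONFLICT(month) DO UPDATE SET revenue = revenue + excluded.revenue
    """,
    new_batch,
)
print(conn.execute("SELECT * FROM monthly_revenue ORDER BY month").fetchall())
# [('2026-01', 175.0), ('2026-02', 75.0)]
```

Existing months are updated in place and new months are appended, which is what keeps compute proportional to the batch size rather than to table history.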
  • Direct connection to BI dashboards

Ready to build?

Create your first pipeline in minutes.

Connect a source, transform with SQL, and schedule with Airflow — all from the browser.


© 2026 OptimaFlo. All rights reserved.
