
For data teams running Kafka at scale, the path from raw streams to analytics-ready data has never been straightforward. Custom pipelines, fragile connectors, and manual schema management create operational overhead that adds up quickly, pulling engineering resources away from work that actually moves the business forward.
Today, Starburst and StreamNative are announcing a partnership that changes that equation. By combining StreamNative’s new native Kafka Service with Starburst Managed Ingestion, we’re delivering a complete, production-grade path from real-time data streams to queryable Apache Iceberg tables, and from there to immediate, high-performance analytics across your entire data estate.
Two platforms, one complete workflow
StreamNative brings together the best of both worlds by supporting Apache Kafka and Apache Pulsar on a shared, lakehouse-native foundation powered by Ursa, the first lakehouse-native streaming engine for Kafka. Using this approach, teams can build with the streaming platform they already know and trust, while benefiting from a unified architecture that seamlessly connects real-time data to the lakehouse.
With Ursa as the common data engine, StreamNative eliminates silos between streaming and analytics, enabling organizations to move from data ingestion to AI and insights in a single, continuous workflow.
What this means for you:
- Platform flexibility – Choose Kafka or Pulsar without architectural trade-offs
- Unified data foundation – One storage and semantics layer across all streaming workloads
- End-to-end data flow – From real-time streams to lakehouse analytics and AI, without friction
Understanding StreamNative’s Native Kafka Service
StreamNative’s Native Kafka Service, in Public Preview, is a cloud-native Kafka offering built on a diskless architecture. It delivers the performance and API compatibility that Kafka users expect, while eliminating much of the operational complexity that comes with traditional Kafka deployments. No disk provisioning, no capacity guesswork, and a streamlined experience purpose-built for cloud environments.
Understanding Starburst Managed Ingestion
Starburst Managed Ingestion connects directly to the Kafka stream and powers the ingestion pipeline. Connect it to a Kafka topic, and it automates the entire landing process, writing data to Apache Iceberg tables with built-in schema management, optimized partitioning, and continuous table maintenance. For teams working with Avro-encoded topics, Starburst integrates directly with the StreamNative Schema Registry, validating messages against registered schemas at ingest time and enforcing compatibility rules before data ever lands in your Iceberg tables. This means pipelines stay stable and governed as schemas evolve, without manual intervention or silent ingestion failures. The result is Iceberg tables that are structured, performant, and immediately ready for SQL analytics the moment data arrives.
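To make the compatibility-enforcement idea concrete, here is a minimal toy sketch of the kind of backward-compatibility rule a schema registry typically checks before accepting a new Avro schema: new fields need defaults, and existing fields must not change type. This is an illustration of the general technique, not StreamNative's or Starburst's actual implementation, and the schemas are invented.

```python
# Toy backward-compatibility check in the spirit of an Avro schema
# registry: a reader using the new schema must still be able to
# decode records written with the old one. (Illustrative only; real
# registries implement the full Avro schema-resolution rules.)

def is_backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    old_fields = {f["name"]: f for f in old_schema["fields"]}
    for field in new_schema["fields"]:
        old = old_fields.get(field["name"])
        if old is None:
            if "default" not in field:      # new field without a default
                return False
        elif old["type"] != field["type"]:  # incompatible type change
            return False
    return True

v1 = {"fields": [{"name": "user_id", "type": "long"}]}
v2 = {"fields": [{"name": "user_id", "type": "long"},
                 {"name": "region", "type": "string", "default": "us"}]}
v3 = {"fields": [{"name": "user_id", "type": "string"}]}

print(is_backward_compatible(v1, v2))  # True: added field has a default
print(is_backward_compatible(v1, v3))  # False: type change breaks readers
```

A registry that enforces a rule like this at publish time is what lets the ingestion pipeline reject incompatible messages before they ever reach the Iceberg table.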
How this works in production
Once data is in Iceberg, the Starburst analytics engine, built on Trino, the industry’s highest-performance federated SQL engine, makes it immediately operational. Query your freshly ingested streaming data directly at scale without moving or copying it, and join it in real time against the rest of your data estate through a single SQL interface.
The end-to-end workflow is straightforward:
1. Create a topic on StreamNative’s Native Kafka Service and start producing events.
2. Point Starburst Managed Ingestion at the topic; schema management, partitioning, and table maintenance are handled automatically.
3. Data lands continuously in Apache Iceberg tables.
4. Query those tables immediately with the Starburst analytics engine, joining them against the rest of your data estate.
Before going public with this partnership, both teams validated this workflow end-to-end. The integration connected cleanly, data landed correctly, and tables were immediately queryable, confirming this is a production-ready path.
Why Apache Iceberg is the right destination for most data
Apache Iceberg has emerged as the open standard for analytical tables in the modern lakehouse. Supported across every major query engine, cloud provider, and data platform, it provides ACID transactions, schema evolution, time travel, and partition pruning on storage you own, without binding you to a single vendor.
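Time travel, one of the Iceberg features mentioned above, works because the table keeps a log of immutable snapshots; an "as of" query simply resolves to the latest snapshot committed at or before the requested time. A minimal toy sketch of that resolution step (real Iceberg stores snapshots in table metadata files; the names here are invented):

```python
# Toy model of Iceberg-style time travel: resolve a query "as of"
# a timestamp to the newest snapshot committed at or before it.
from dataclasses import dataclass

@dataclass
class Snapshot:
    snapshot_id: int
    committed_at: int  # epoch millis

def snapshot_as_of(snapshots, ts):
    eligible = [s for s in snapshots if s.committed_at <= ts]
    return max(eligible, key=lambda s: s.committed_at) if eligible else None

log = [Snapshot(1, 1000), Snapshot(2, 2000), Snapshot(3, 3000)]
print(snapshot_as_of(log, 2500).snapshot_id)  # 2: latest commit before 2500
print(snapshot_as_of(log, 500))               # None: table did not exist yet
```

Because every snapshot is retained until expired, the same mechanism also underpins rollback and auditability.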
The pairing of Kafka and Iceberg is natural. Kafka handles data in motion, while Iceberg handles data at rest. What has historically been difficult is bridging the two in a way that is reliable, scalable, and does not require dedicated pipeline engineering to sustain. That is the gap this partnership is designed to close.
Choosing open formats like Iceberg is also a compounding decision. The more of your data lands in Iceberg today, the more flexibility you retain as your architecture evolves, and the more of that data becomes accessible to the Starburst engine for federated analytics across your broader data landscape.
Moving from ingestion to operationalization
The true measure of any ingestion platform is not just how reliably data lands, but how quickly it becomes useful. Starburst Managed Ingestion and the Starburst analytics engine work together as a single, coherent system to ensure that the answer is: immediately.
Enterprise-grade ingestion at scale
Starburst Managed Ingestion is purpose-built for the Iceberg Lakehouse. It processes records as they arrive from Kafka, bypassing the latency and overhead of batch-oriented pipelines. Data is parsed, validated, and written to Iceberg using append-optimized paths that preserve schema and snapshot integrity, with built-in exactly-once delivery and near real-time latency. As demonstrated in our most recent independent benchmarking report, Starburst delivers superior Iceberg ingestion performance compared to competitors, with approximately 7x higher record ingestion rates at significantly lower cost.
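One common way exactly-once landing is achieved in pipelines like this is to commit the last-written Kafka offset per partition atomically with the table data, so replayed records after a retry are recognized and skipped. The sketch below illustrates that general technique with invented names; it is not Starburst's actual implementation.

```python
# Idempotent sink sketch: track the last committed offset per Kafka
# partition alongside the table, and drop any replayed record at or
# below it. (Illustrative of the exactly-once technique in general,
# not a real Iceberg writer.)

class OffsetTrackingSink:
    def __init__(self):
        self.rows = []
        self.committed = {}  # partition -> last offset landed

    def append(self, partition: int, offset: int, record: dict) -> bool:
        if offset <= self.committed.get(partition, -1):
            return False     # duplicate from a replayed batch; skip
        self.rows.append(record)
        self.committed[partition] = offset
        return True

sink = OffsetTrackingSink()
batch = [(0, 0, {"v": "a"}), (0, 1, {"v": "b"})]
for p, o, r in batch:
    sink.append(p, o, r)
for p, o, r in batch:        # same batch replayed after a retry
    sink.append(p, o, r)
print(len(sink.rows))  # 2 — replayed records were deduplicated
```

The key property is that the offset watermark and the data are committed together, so a crash between the two can never produce duplicates or gaps.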
This is not infrastructure you need to stand up and manage. Starburst runs the ingestion engine on a serverless architecture, scaling compute instantly to match fluctuating data volumes and charging only for capacity actually consumed.
Automated table maintenance
Iceberg tables require continuous upkeep to remain performant. As data accumulates, small files from streaming writes, orphaned metadata, and stale snapshots degrade query performance over time if left unmanaged. Starburst handles all of this automatically, running intelligent compaction, snapshot expiration, and orphan file removal as a fully managed background service. Your tables stay fast and query-ready without your team writing a single maintenance script.
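The compaction step described above is essentially a bin-packing problem: gather the small files produced by streaming writes and rewrite them into fewer files near a target size. A toy planner illustrating the idea (the size thresholds and function names are invented; real Iceberg compaction also considers partitions, deletes, and metadata):

```python
# Toy bin-packing compaction planner: group small data files into
# rewrite tasks of roughly target size. (Sizes in MB; illustrative
# of the core idea behind Iceberg small-file compaction.)

def plan_compaction(file_sizes_mb, target_mb=128, small_mb=32):
    small = sorted(s for s in file_sizes_mb if s < small_mb)
    groups, current, total = [], [], 0
    for size in small:
        if total + size > target_mb and current:
            groups.append(current)   # close this rewrite task
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        groups.append(current)
    return groups

# One 200 MB file is already healthy; the six small files get
# rewritten together into a single larger file.
sizes = [4, 8, 16, 200, 12, 30, 6]
print(plan_compaction(sizes))  # [[4, 6, 8, 12, 16, 30]]
```

Running a planner like this continuously in the background is what keeps scan performance stable as streaming writes accumulate.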
Immediate operationalization with Trino
Once data lands, the Starburst analytics engine, built on Trino, makes it available for high-performance SQL analytics immediately. Trino was designed for distributed query execution at scale, capable of processing billions of rows across multiple data sources simultaneously without requiring data to be moved or centralized first.
In practice, this means analysts can run SQL queries that join live StreamNative Kafka data against historical records in a cloud data warehouse, reference tables in an operational database, or metrics from a SaaS source, all in a single query with no ETL in between. The streaming data becomes operational the moment it lands, not after an overnight batch job.
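A federated query of the kind described above might look like the following. The catalog, schema, and column names are invented for illustration; in practice the statement would run through any Trino-compatible client against Starburst.

```python
# Hypothetical federated SQL statement: join freshly ingested Kafka
# events (landed in Iceberg) against a warehouse dimension table in
# a single query, with no ETL in between. All identifiers are
# illustrative, not real catalog names.

query = """
SELECT o.order_id, o.amount, c.segment
FROM iceberg.streaming.orders AS o        -- landed from Kafka
JOIN warehouse.crm.customers AS c         -- federated source
  ON o.customer_id = c.customer_id
WHERE o.event_time > current_timestamp - INTERVAL '1' HOUR
"""

# A Trino client would submit this as-is, e.g. cursor.execute(query).
print(query.strip().splitlines()[0])
```

The point is that both sides of the join stay where they live; only the query moves.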
Rather than maintaining separate pipelines to synchronize data before it can be analyzed, teams can query across sources directly in real time using the SQL they already know.
Building an open, AI-ready foundation
The value of this integration extends beyond operational efficiency. As organizations build AI and machine learning workflows on top of their data, the freshness, structure, and accessibility of that data become a competitive differentiator. Streaming data that lands in well-maintained Iceberg tables is available immediately, not just for dashboards and reporting, but as live input for real-time AI applications. With Starburst’s federated query layer, that data can also be enriched with context from across the enterprise before it ever reaches a model.
Jitender Aswani, SVP of Engineering at Starburst, puts it this way:
“StreamNative and Starburst bring together real-time streaming and seamless Iceberg ingestion to simplify how data moves and becomes usable. Together, we create an open, AI-ready foundation. With Starburst, organizations can query that data directly at scale for real-time analytics and AI.”
This is the broader ambition behind the partnership: helping organizations build on open standards, move quickly with real-time data, and create infrastructure that is ready for whatever comes next, including AI workloads that depend on fast, reliable, and accessible data at scale.
Who is this built for?
This solution is well-suited for teams running Kafka today who need a reliable, managed path to Iceberg without building and maintaining custom ingestion pipelines. It is equally relevant for organizations designing new streaming architectures who want the full lakehouse story, including ingestion, automated table maintenance, and high-performance federated analytics, built in from the start rather than bolted on later.
If your team is spending meaningful engineering time keeping ingestion pipelines healthy rather than building on top of them, or if your analysts are waiting on data that should already be queryable, this is worth a close look.
Get started with Starburst Galaxy
StreamNative’s Native Kafka Service is available in Public Preview starting today. Starburst Managed Ingestion and the Starburst analytics engine are available on Starburst Galaxy.



