
Why Apache Iceberg is gaining popularity in Japan

Yuya Ebihara

    Software Engineer

    Starburst


Apache Iceberg has grown tremendously in recent years, particularly over the last 12 months. There’s a simple reason for this: the data lakehouse table format solves the problems facing data architecture today, whether they involve data analytics, artificial intelligence (AI), or data applications.

This is very much Iceberg’s moment in the data community, and that appeal is growing. 

Apache Iceberg’s appeal is global

Importantly, that moment is also global. Iceberg isn’t just becoming central in one place; it’s becoming central everywhere. For example, the Iceberg Japan community recently hosted its first meetup. We had five exciting sessions, covering a deep dive into the Iceberg V3 spec, conflict resolution using Iceberg, real-world large-scale use cases, an introduction to Snowflake and Iceberg, and insights into Databricks Unity Catalog.

You can check out the slide decks from each session here:

Data architecture in Japan

Overall, this signals Iceberg’s growing importance in Japan. There are several reasons for this. Some are the same reasons that people everywhere like Iceberg (it solves their data problems), and others are specific to the Japanese market.

Japan and hybrid data lakehouse architecture 

On-premises data is very important in Japan. Often this is due to security, compliance, or regulatory concerns. At the same time, cloud computing in Japan is growing. For this reason, many Japanese companies prefer a hybrid cloud approach. A hybrid approach allows organizations to retain personal data on-premises while storing other data in the cloud. 

The country’s IT landscape is shaped by a strong presence of legacy systems, strict regulations, and a growing but cautious adoption of cloud technologies. As a result, many enterprises rely on a hybrid model, balancing on-premises infrastructure with public cloud services.

Regardless of the architecture, one major challenge remains: data silos.

 

Apache Iceberg in Japan

Overall, you can think of two main use cases for Iceberg in Japan.

The first is migrating from the Hive table format to Iceberg, as Tasuku shared in their case study at LY Corporation. Hive has several limitations, including limited support for ACID transactions and schema evolution. Large Hive deployments also suffer from heavy load on the Hive Thrift Metastore and from slow query performance due to missing file-level statistics. To overcome these challenges, many large-scale Hive users are adopting Iceberg.
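To make the point about file-level statistics concrete, here is a minimal sketch of how per-file minimum and maximum values let a query engine skip files that cannot match a filter. It is a toy model in plain Python, not Iceberg’s actual metadata layout or API; the `DataFile` class, the `files_to_scan` helper, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Toy stand-in for per-file metadata; not Iceberg's real manifest schema."""
    path: str
    min_order_date: str  # ISO dates compare correctly as plain strings
    max_order_date: str


def files_to_scan(files, lower, upper):
    """Keep only files whose [min, max] range overlaps the filter [lower, upper]."""
    return [
        f for f in files
        if f.max_order_date >= lower and f.min_order_date <= upper
    ]


# Hypothetical files and value ranges, for illustration only.
files = [
    DataFile("part-0.parquet", "2024-01-01", "2024-01-31"),
    DataFile("part-1.parquet", "2024-02-01", "2024-02-29"),
    DataFile("part-2.parquet", "2024-03-01", "2024-03-31"),
]

# A filter on mid-February only needs to open one of the three files.
selected = files_to_scan(files, "2024-02-10", "2024-02-20")
print([f.path for f in selected])  # ['part-1.parquet']
```

Without such statistics, an engine has to open every file in the matching partitions to find out whether it contains relevant rows, which is one source of the slow queries described above.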

The second use case is avoiding vendor lock-in. Because Iceberg’s table format is self-describing, tables can be shared across different query engines. This is especially important for Japanese companies, which often use multiple cloud services (for example, Snowflake on AWS or Databricks on AWS) and need a flexible, interoperable data architecture.

 

Starburst in Japan

Starburst sponsored the event because we love Apache Iceberg and are committed to investing in the Iceberg community in Japan.

One key aspect of Iceberg is table maintenance, which prevents problems such as the accumulation of small files and orphaned files. Starburst automates this process, so Iceberg users can focus on their analytics without worrying about maintenance tasks.
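For a sense of what these maintenance tasks look like when run by hand, here is a small sketch that builds the corresponding SQL statements. It assumes the `optimize`, `expire_snapshots`, and `remove_orphan_files` table procedures of the open-source Trino Iceberg connector. It does not describe how Starburst’s automation works internally, and the table name, retention period, and size threshold are hypothetical values chosen for illustration.

```python
def maintenance_statements(table, retention="7d", file_size_threshold="128MB"):
    """Build manual maintenance SQL for one Iceberg table (illustrative only)."""
    return [
        # Rewrite small files below the threshold into larger ones.
        f"ALTER TABLE {table} EXECUTE optimize"
        f"(file_size_threshold => '{file_size_threshold}')",
        # Drop snapshots older than the retention period.
        f"ALTER TABLE {table} EXECUTE expire_snapshots"
        f"(retention_threshold => '{retention}')",
        # Delete files that no table metadata references any more.
        f"ALTER TABLE {table} EXECUTE remove_orphan_files"
        f"(retention_threshold => '{retention}')",
    ]


# Hypothetical catalog.schema.table name.
for stmt in maintenance_statements("iceberg.sales.orders"):
    print(stmt)
```

Running statements like these on a schedule for every table is the kind of recurring work that automated maintenance is meant to take off users’ hands.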

Additionally, Starburst enables analytics across different data formats and storage systems (whether that’s an RDBMS and Iceberg, or AWS and on-premises) without the need to move data.

I have no doubt that Starburst + Iceberg + Japan is only just getting started.