Cross-platform with Cluster

Hi,

I’m new to Starburst. I started using the free version last week to explore the tool and show my company what is possible.

For my project, I use Snowflake, Google Cloud Storage and BigQuery. My goal is to run a federated query across these three platforms. I managed to do that, but I struggled quite a lot to get to the final result.

Here is my problem:

  • My initial Snowflake trial account was hosted on AWS in an EU region (Zurich) compatible with Starburst.
  • My Cloud Storage and BigQuery were both in the same EU region (Belgium), which was also compatible with Starburst.

The problem with this setup is that I cannot add the Snowflake catalog to the same cluster as my Cloud Storage and BigQuery catalogs. The option is greyed out when I try, with the message “not in the same region as catalog”.

So what am I missing? I suspect that cross-platform is not possible, but that seems odd because it would rule out a lot of use cases. Or perhaps the free version is limited.

I also have another question. The documentation lists an IP allowlist for each provider. At the beginning, my Cloud Storage and BigQuery were hosted in a region that wasn’t on this list and it didn’t work, so I moved them to a region on the list. Is this also a limitation of the free version? For example, if a company has all its instances in a region that is not on this list, Starburst would be useless to them because it would not work. I don’t think moving all the instances would be practical, and for Snowflake it isn’t even possible once the account is created.

To “resolve” this problem for my example project, I created another Snowflake trial account, this time hosted on GCP but in London (I didn’t have the option of Belgium like my other instances). I expected this to fail like before, but it worked: the cross-region setup was accepted this time, and I can run a federated query with my three different catalogs in the same cluster.
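
For illustration, a federated query of this kind could look like the minimal sketch below. All catalog, schema, table and column names are hypothetical placeholders and not taken from my actual project; the only thing the sketch relies on is that each source is addressed as catalog.schema.table inside a single SQL statement.

```python
# Hypothetical fully qualified table names (catalog.schema.table).
# Replace them with the catalogs attached to your own cluster.
TABLES = {
    "snowflake": "snowflake_gcp.sales.customers",
    "bigquery": "bigquery_be.analytics.orders",
    "gcs": "gcs_be.raw.events",
}


def build_federated_query(tables: dict) -> str:
    """Return one SQL statement that joins a table from each catalog."""
    return (
        "SELECT c.customer_id,\n"
        "       count(DISTINCT o.order_id) AS orders,\n"
        "       count(DISTINCT e.event_id) AS events\n"
        f"FROM {tables['snowflake']} AS c\n"
        f"JOIN {tables['bigquery']} AS o ON o.customer_id = c.customer_id\n"
        f"JOIN {tables['gcs']} AS e ON e.customer_id = c.customer_id\n"
        "GROUP BY c.customer_id"
    )


if __name__ == "__main__":
    # Print the statement so it can be pasted into the query editor.
    print(build_federated_query(TABLES))
```

The statement only runs if all three catalogs are attached to the same cluster, which is exactly the step that was blocked with the AWS-hosted Snowflake account.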

I saw an almost identical problem in the topic “Cross Region Query”, but I think my problem is more specific.

Thank you for your help

Yes, it seems to me this is still the cross-region situation. As you can see in Starburst | Cross-region support, this is still marked as private preview. As I alluded to in Cross Region Query, I think this is probably mostly about controlling costs (yours and Starburst’s) at the moment.

Okay, thanks for your reply @lester. So for cross-platform (Snowflake on AWS in the same cluster as my GCP sources, which is impossible in my case), is it also a limitation of the free version due to the region, or is it for cost reasons? I don’t see this mentioned anywhere, but from what I have read on different websites it’s normally possible to query all the sources together even if they are hosted with different providers in different regions.

I did get some information internally from our product folks. Seems you were right in the first place that we are limiting the cross-platform (we call it cross-cloud) connectivity from your GCP cluster to your AWS Snowflake instance. Seems as of right now, the only cross-cloud opportunity is with PostgreSQL. It also seems more work is forthcoming for other services.

This explains (just as you thought) why all worked when you created a Snowflake instance on GCP that your Galaxy GCP cluster could access. As for cross-region, it seems there are no inherent cross-region constraints on GCP for Snowflake or BigQuery, as you saw in practice.

It seems most of the cross-region constraints are tied to object stores. If you encounter any other object store cross-region issues, click on the in-app chat of Galaxy and ask for the CROSS_REGION_OBJECT_STORE “feature flag”.
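
To summarize the behaviour described in this thread, here is a rough sketch of the rules as a small Python function. It is not Galaxy's actual logic or API: the `Catalog` type, the `can_attach` function and the connector sets are hypothetical names made up for illustration. It also generalizes from the GCP case observed here and assumes cross-cloud PostgreSQL is allowed regardless of region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Catalog:
    name: str
    connector: str  # e.g. "snowflake", "bigquery", "gcs", "postgresql"
    cloud: str      # e.g. "aws", "gcp"
    region: str     # informal label, e.g. "belgium", "london", "zurich"


# Assumption: which connectors count as object stores.
OBJECT_STORES = {"gcs", "s3", "adls"}
# Per this thread, PostgreSQL is currently the only cross-cloud option.
CROSS_CLOUD_CONNECTORS = {"postgresql"}


def can_attach(catalog: Catalog, cluster_cloud: str, cluster_region: str,
               cross_region_object_store: bool = False) -> bool:
    """Approximate whether a catalog can be added to a cluster."""
    if catalog.cloud != cluster_cloud:
        # Cross-cloud: only the listed connectors (region ignored, an assumption).
        return catalog.connector in CROSS_CLOUD_CONNECTORS
    if catalog.region == cluster_region:
        return True
    if catalog.connector in OBJECT_STORES:
        # Cross-region object stores need the feature flag enabled.
        return cross_region_object_store
    # Same cloud, different region, not an object store: worked in practice.
    return True


if __name__ == "__main__":
    aws_snowflake = Catalog("sf_aws", "snowflake", "aws", "zurich")
    gcp_snowflake = Catalog("sf_gcp", "snowflake", "gcp", "london")
    print(can_attach(aws_snowflake, "gcp", "belgium"))
    print(can_attach(gcp_snowflake, "gcp", "belgium"))
```

With a GCP cluster in Belgium, the AWS-hosted Snowflake catalog is rejected and the GCP-hosted one in London is accepted, matching what was observed above.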