Hello
I am currently running Exasol in a multi-cloud setup, with instances in both AWS and Azure. I would like to keep certain schemas synchronized across these environments to support analytics workloads that span cloud boundaries.
While data loading and replication tools exist, I have not found clear guidance on how to keep schemas, including UDFs, views, and permissions, consistently mirrored without manual export/import scripts.
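To make it concrete, below is roughly the kind of manual export script I would like to avoid hand-maintaining. It is only a sketch assuming the pyexasol driver; the connection details and the schema name are placeholders, and as far as I can tell the `VIEW_TEXT`/`SCRIPT_TEXT` system columns hold the original CREATE statements:

```python
# Sketch: extract view and UDF/script DDL from the system tables so it can
# be replayed on the other cloud. Assumes the pyexasol driver; DSN,
# credentials, and the schema name are placeholders.
import pyexasol

SCHEMA = "ANALYTICS"  # hypothetical schema to mirror

src = pyexasol.connect(dsn="aws-exasol:8563", user="sys", password="...")

# EXA_ALL_VIEWS.VIEW_TEXT and EXA_ALL_SCRIPTS.SCRIPT_TEXT contain the
# CREATE statements, so they can be written out and replayed on the target.
views = src.execute(
    "SELECT view_text FROM sys.exa_all_views WHERE view_schema = {s}",
    {"s": SCHEMA},
).fetchall()
scripts = src.execute(
    "SELECT script_text FROM sys.exa_all_scripts WHERE script_schema = {s}",
    {"s": SCHEMA},
).fetchall()

with open("schema_ddl.sql", "w") as f:
    for (ddl,) in views + scripts:
        f.write(ddl.rstrip(";\n") + ";\n\n")
```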
Ideally, I would like to automate schema synchronization as part of a CI/CD workflow, but I am unsure whether Exasol supports native tooling for this, or whether community practice favors external solutions (e.g., Flyway, Liquibase, custom scripts).
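For the CI/CD part, what I have in mind is something like the sketch below: one pipeline step that applies the same versioned migration files to every cloud, so drift can only come from changes made outside the pipeline. Again, pyexasol is assumed, and the DSNs, credentials, and the OPS.SCHEMA_VERSION bookkeeping table are placeholders:

```python
# Sketch of a migration runner: apply the same versioned .sql files to
# every cloud in one pipeline step. pyexasol assumed; DSNs, credentials,
# and the OPS.SCHEMA_VERSION bookkeeping table are placeholders.
import pathlib
import pyexasol

TARGETS = ["aws-exasol:8563", "azure-exasol:8563"]  # placeholder DSNs

for dsn in TARGETS:
    conn = pyexasol.connect(dsn=dsn, user="deploy", password="...")
    conn.execute("CREATE SCHEMA IF NOT EXISTS ops")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ops.schema_version (filename VARCHAR(200))")
    applied = {row[0] for row in conn.execute(
        "SELECT filename FROM ops.schema_version").fetchall()}
    for path in sorted(pathlib.Path("migrations").glob("*.sql")):
        if path.name in applied:
            continue  # this migration already ran on this target
        # Naive statement splitting; fine for DDL files without ';' in bodies.
        for stmt in path.read_text().split(";"):
            if stmt.strip():
                conn.execute(stmt)
        conn.execute("INSERT INTO ops.schema_version VALUES ({f})",
                     {"f": path.name})
    conn.commit()
    conn.close()
```

This is essentially the bookkeeping Flyway and Liquibase do for you, which is why I am wondering which route people take with Exasol.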
Also, are there any gotchas related to Exasol's internal metadata or cloud-specific configurations that I should consider when syncing between clouds?
I checked the Data Migration guide in the Exasol DB documentation for reference and found it helpful.
Has anyone implemented schema sync successfully across multi-cloud Exasol deployments? 
I would love to hear how you've approached schema drift, security context differences, or even differences in cloud performance that affect this setup.
Thank you!
Hi @reyiyeb,
as you correctly found out, Exasol has a lot of import/export capabilities, but we don't offer an out-of-the-box solution for synchronizing an Exasol database from one cloud provider to another.
I can see how having Exasol in two or more clouds would be a way to prevent cloud-provider lock-in.
For this to work there are a lot of things one would have to take into account:
- You need a reliable and secure cross-provider network connection (ideally a VPN)
- You absolutely need delta-loading, since cloud providers tend to charge generously for traffic that leaves their data centers (see the sketch after this list)
- Delta-loading requires a separate disaster-recovery strategy: it reduces traffic, but it is the opposite of self-repairing
- You double your cloud monitoring effort
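To sketch what watermark-based delta-loading could look like (an illustration, not a supported tool): assuming the pyexasol driver and a reliable UPDATED_AT column on the replicated table, with all DSNs, credentials, and table names as placeholders:

```python
# Sketch of watermark-based delta loading between the two clouds; only rows
# newer than the target's high-water mark cross the (expensive) cloud
# boundary. Assumes pyexasol and a reliable UPDATED_AT column; DSNs,
# credentials, and table names are placeholders.
import pyexasol

src = pyexasol.connect(dsn="aws-exasol:8563", user="sys", password="...")
dst = pyexasol.connect(dsn="azure-exasol:8563", user="sys", password="...")

# High-water mark on the target, formatted so it round-trips safely.
watermark = dst.execute("""
    SELECT TO_CHAR(COALESCE(MAX(updated_at), TIMESTAMP '1970-01-01 00:00:00'),
                   'YYYY-MM-DD HH24:MI:SS.FF3')
    FROM analytics.fact_sales
""").fetchval()

# Pull only the delta from the source and append it on the target.
delta = src.export_to_pandas(
    """SELECT * FROM analytics.fact_sales
       WHERE updated_at > TO_TIMESTAMP({wm}, 'YYYY-MM-DD HH24:MI:SS.FF3')""",
    {"wm": watermark},
)
dst.import_from_pandas(delta, ("ANALYTICS", "FACT_SALES"))
dst.commit()
```

Note that a plain append like this only handles inserts; if rows can be updated, you would land the delta in a staging table and MERGE it into the target, which also gives you a natural retry point.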
Since you are the first person asking for this particular setup I am afraid we don’t have a ready-to-use approach here.
Your use case requires limiting traffic to an absolute minimum to keep the traffic costs acceptable, so that will by far be the hardest part of the challenge.
Could you describe in broad strokes the size of the database you have in mind and how much added or modified data you expect over time?
Hi reyiyeb,
We run Exasol on-prem in multiple data centers and keep them synchronized by having our data pipeline write to both instances simultaneously. On top of that, a nightly job (a custom script) copies all metadata (UDFs, functions, users, roles, permissions, etc.) from one data center to the other.
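For illustration (this is not our actual script), the grants part of such a nightly copy could look roughly like this, assuming the pyexasol driver; the DSNs, credentials, and schema name are placeholders, and views/UDFs can be copied the same way from their *_TEXT columns in the system tables:

```python
# Sketch of a nightly grant copy: regenerate GRANT statements from the
# source's EXA_DBA_OBJ_PRIVS and replay them on the target. pyexasol
# assumed; DSNs, credentials, and the schema name are placeholders.
import pyexasol

src = pyexasol.connect(dsn="dc1-exasol:8563", user="sys", password="...")
dst = pyexasol.connect(dsn="dc2-exasol:8563", user="sys", password="...")

rows = src.execute("""
    SELECT privilege, object_schema, object_name, grantee
    FROM sys.exa_dba_obj_privs
    WHERE object_schema = 'ANALYTICS'
""").fetchall()

for privilege, schema, name, grantee in rows:
    # Re-issuing a grant that already exists is harmless, so we replay all.
    dst.execute(f'GRANT {privilege} ON "{schema}"."{name}" TO "{grantee}"')
dst.commit()
```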
Peter