Demystifying Hybrid Cloud Data Warehouses for the CDO

April 13, 2020 | Industry

Data is the foundation of any meaningful corporate initiative. Fully master the data an initiative needs and you’re more than halfway to its success. That’s why leverageable (i.e., multiple-use) artifacts of the enterprise data environment are so critical to enterprise success.

Build them once, keep them updated, and use them many times for many diverse ends. The data warehouse remains primary to this goal. That is why, almost 40 years after the first database labeled a “data warehouse” was built, analytic database products still target the data warehouse, and some are even named “data warehouse”.

A large part of the allure can be found in the cloud integrations of the products. The cloud is a disruptive technology, offering elastic scalability vis-à-vis on-premises deployments, enabling faster server deployment and application development, and allowing for less costly storage. For these reasons and more, many companies have leveraged the cloud to maintain, or gain, momentum as a company.

The pairing of analytical capability in the cloud with a data warehouse appliance or local cluster is making more sense for many companies. Fully realized, a hybrid cloud data warehouse approach will propel a company’s capabilities well ahead. Be forewarned, though: not all platforms leverage this breakout capacity to the fullest.

Deploying data warehouses in a hybrid cloud environment means:

Splitting the Data Warehouse and Analytical Workloads across Cloud and Existing Infrastructure

Most companies are multi-cloud. Some, or most, of this can be attributed to bottom-up cloud selection, assumptions that all clouds are the same, vendor affinity, differing skill sets across the enterprise and, yes, some parts of the enterprise simply wanting to be different from others.

Shops that are more mature with the cloud are also arriving at multi-cloud, but for different reasons and with a different, more effective allocation. We are at the point where the major clouds are forking in their approaches. There are significant differences among the clouds, which favor one over another when deciding on workload allocation.

Also, it’s a fallacy to believe that going to the cloud requires less maturity than staying on-premises. On-premises has inherent advantages, but importantly, cloud adoption should occur in alignment with company maturity. A public-cloud-only strategy, especially when accompanied by company immaturity, can be disadvantaged by an inability to support high concurrency, and it can face cost challenges in attaining consistent results with large data sets and petabytes of data. Toss this all together and we are seeing an all-time high in the complexity of making correct workload allocation decisions. Mature shops make the best workload allocations, knowing they have the depth of knowledge to deal with multi-cloud and on-premises as appropriate.

Price and Performance Variability

Price and performance are critical points of interest when selecting a platform, because they ultimately impact total cost of ownership, value, and user satisfaction. Our analysis shows that some solutions deliver considerably more value than others.

Data professionals who used to be valued for tuning queries will henceforth be valued for tuning costs. Data warehouse pricing comes in a dizzying array of possibilities, such as per user (named or concurrent), per processor, per core, or actual usage (by processing time or cycles or by data stored or transferred). The move of on-premises software to the cloud is not straightforward from a licensing perspective. Vendors often view this as an opportunity to renegotiate. So be prepared.
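To make the idea of tuning costs concrete, here is a minimal sketch of how the same monthly workload might be priced under three of the models listed above: per named user, per core, and actual usage. Every name and every rate in it is a hypothetical placeholder chosen for illustration, not a figure from any vendor or benchmark. Substitute your own negotiated rates.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """One month of data warehouse activity (all fields are illustrative)."""
    named_users: int
    cores: int
    compute_hours: float  # hours of query processing in the month
    stored_tb: float      # terabytes kept in storage


def per_user_cost(w: Workload, rate_per_user: float) -> float:
    # Named-user licensing: pay for every licensed user.
    return w.named_users * rate_per_user


def per_core_cost(w: Workload, rate_per_core: float) -> float:
    # Per-core licensing: pay for provisioned cores, busy or idle.
    return w.cores * rate_per_core


def usage_cost(w: Workload, rate_per_compute_hour: float, rate_per_tb: float) -> float:
    # Usage-based pricing: pay for processing time plus data stored.
    return w.compute_hours * rate_per_compute_hour + w.stored_tb * rate_per_tb


def compare(w: Workload, rates: dict) -> dict:
    """Return the monthly cost under each model, cheapest first."""
    costs = {
        "per_user": per_user_cost(w, rates["per_user"]),
        "per_core": per_core_cost(w, rates["per_core"]),
        "usage": usage_cost(w, rates["per_compute_hour"], rates["per_tb"]),
    }
    return dict(sorted(costs.items(), key=lambda item: item[1]))


if __name__ == "__main__":
    # Hypothetical workload and rates; replace with your own numbers.
    workload = Workload(named_users=100, cores=16, compute_hours=200.0, stored_tb=10.0)
    rates = {"per_user": 50.0, "per_core": 400.0, "per_compute_hour": 20.0, "per_tb": 25.0}
    for model, cost in compare(workload, rates).items():
        print(f"{model}: {cost:,.2f} per month")
```

The ranking depends entirely on the workload profile. A shop with many occasional users and light processing may favor usage pricing, while a small team running heavy queries around the clock may not, so it is worth rerunning the comparison whenever the workload or the contract changes.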

Details, Details, Details

While it’s assumed that stepping into a hybrid cloud data warehouse means at least some of the production data warehouse will be in the public cloud, decisions must be made about staging areas, development and quality assurance, and disaster recovery environments. You might choose to move some workloads to the cloud for agility and keep others on-premises for performance and security. There may be a stepwise progression into having a hybrid cloud environment. Don’t underestimate the architectural details.

Understand the differences and communicate the plan for dialing in your multi-cloud strategy. And get moving!

William McKnight has advised many of the world's best-known organizations. His strategies form the information management plan for leading companies in various industries. He is a prolific author and a popular keynote speaker and trainer. He has performed dozens of benchmarks on leading database, data lake, streaming and data integration products. William is the #1 global influencer in data warehousing and master data management, and he leads McKnight Consulting Group, which placed on the Inc. 5000 list in 2017 and 2018.