Welcome to the Yellowbrick blog! We took some time to sit down with our CEO Neil Carson to get his perspective on the company, our product, and the overall data market.
We help businesses derive value from their data by giving them the flexibility to deploy analytics anywhere. Specifically, we do this by building the Yellowbrick Data Warehouse.
It looks like a standard data warehouse and speaks protocols that interoperate with those of popular data warehouse products such as IBM/Netezza, Teradata, and AWS Redshift.
You can deploy it anywhere from on-premises to colocation to private cloud, network edge and public cloud. Today you can deploy in the first four of those environments, and public cloud details are coming next year.
Today you can buy our software-plus-hardware solution, the Yellowbrick Data Warehouse Appliance, to deploy in your data center, colocation center, private cloud, or edge network.
You will find Yellowbrick to be the most economical solution for data warehousing. By that, we mean the best price/performance of a reliable product in the industry. We welcome customers to contact us about pricing. Compared to traditional data warehousing solutions, Yellowbrick is usually significantly more affordable and can deliver 10x improvements in price/performance.
Data warehousing is an enormous piece of the overall data market.
In 2018, 451 Research sized the data market at nearly $100 billion, analytic data platforms (or data warehouses) at $23 billion, and reporting and analytics, which is almost entirely dependent on data warehouses, at $25 billion.
We think about three areas of the data warehousing market: one dominant, one emerging, and one figuring itself out.
Enterprise Data Warehousing
First is the traditional data warehouse, where the vast majority of the market is covered by large vendors such as Oracle, Microsoft, IBM, Teradata and SAP.
A small portion of these solutions run vertically integrated stacks such as Oracle Financials or SAP ERP. Yellowbrick Data does not participate in that space today.
The rest of the market is wide open, and we have customers coming from multiple product areas across the five leading providers.
One is traditional databases being used for analytics that are reaching their scaling limits. This is most common with single-server databases such as Microsoft SQL Server or Oracle, where the lack of a built-in scale-out solution makes analytics on large amounts of data unwieldy. We relieve the analytic pressure on operational databases and move those analytics to the purpose-built Yellowbrick offering.
Another area is existing data warehouses now represented by cumbersome and aging technology. Common solutions in this area include IBM/Netezza, now an end-of-life product, and Teradata.
Whatever the enterprise's choice of analytic database or data warehouse, we see common workloads that make the business run.
These workloads let companies make the decisions that move their business forward.
All of these industries and business-critical decisions are enabled by enterprise data warehouses today, but companies need to gather more data and analyze more of it to compete in a world undergoing digital transformation.
Cloud-only Data Warehousing
Another area that is rapidly emerging is cloud-only data warehouses. Specifically, I am referring to solutions that run in only one provider’s cloud, or solutions that are intended to run ONLY in the cloud.
These solutions work well for customers with applications or workloads that are transient, have highly variable peak loads, and focus on web-based data creation. They also work well for customers who are not yet sure what may or may not become a business-critical process.
So for a good cloud-only data warehouse, you will want one that you can also turn off and shrink.
This is not the Yellowbrick Data marketspace. While our solutions will run in the public cloud in the future, our focus is on applications and workloads that are mostly- or always-on, provide any answer at any time with a stringent service level agreement, and deliver on predictable cost and performance.
Frequently, Yellowbrick Data solutions will make wide-ranging use of web-based data from the cloud as well, but usually as an input to the always-on approach. For example, several of our customers use Amazon, Snowflake and Yellowbrick at the same time for different use cases.
Hadoop-reliant Data Warehousing
The final area of data warehousing covers attempts to leverage the underlying Hadoop Distributed File System as a data warehouse. It is becoming more apparent to the industry every day that Hadoop is a wonderful place to store data temporarily or indefinitely, but the data processing challenges on a Hadoop cluster are widely recounted.
This includes everything from the original MapReduce jobs written on top of Hadoop to the dozens if not hundreds of attempts to rescue data out of the data lake with Spark, SQL-on-Hadoop, or any other similarly directed endeavor.
Hadoop or anything used as a data lake remains a great place to store data on ingest, perhaps short term, perhaps forever.
Yellowbrick Data can ingest data from data lakes, taking in only the relevant data required, and from there provide operational reporting and analytics to drive business value.
It is a mix, about half and half right now.
New analytic applications are built daily, and Yellowbrick wants to be the go-to solution for data architects and engineers to discover new opportunities. Once application developers identify new insights, the path to production can be extremely short, as Yellowbrick Data includes robust enterprise capabilities and workload management.
We continue to see analytics on scale-up databases as an inevitable bottleneck for successful applications. Whether these are single-node systems from commercial vendors or open-source solutions, traditional databases handle analytics well only at small scale, and the inherent single-node limitation inhibits further scale. Yellowbrick is the perfect solution at that point.
Examples of greenfield applications include the hospitality industry and IoT-inspired solutions with customer loyalty cards and experiences throughout the resort. To be successful, these customer interactions need to be correlated with other parts of the business, such as lifetime guest revenue, and with live device data such as facial recognition or Bluetooth tracking. Single-node, scale-up approaches simply cannot handle the data capacity required for today’s needs.
Further, many of these new greenfield analytic application opportunities do not fit the economics of cumbersome and aging data warehouse solutions from more traditional providers.
And legacy applications?
This runs the gamut across enterprise applications and represents one of our greatest strengths, because we adopted an ANSI SQL approach, in our case inspired by PostgreSQL.
For example, there is an entire world of ETL tools such as Informatica, Syncsort, IBM InfoSphere DataStage, Attunity and others.
There is also an entire world of business intelligence tools and dashboards such as Tableau, MicroStrategy and more.
There are data mining and analytics tools like SAS, R, and Python.
All of these ecosystems, along with helping customers consolidate and modernize their data warehouse infrastructure, are part of the existing application improvement wave.
Stay tuned for our next Q&A installment on Yellowbrick and the Cloud as well as our company history and vision.