Financial services firms are heavily data-driven. Every day, they must capture and analyze extremely large amounts of information, which makes the speed and scalability of their analytics environments critically important.
This customer is a multinational financial services corporation and member of the S&P 100, with hundreds of billions of dollars in annual revenue. Every day, the company must capture millions of transactions and analyze petabytes of data across its partner and co-branding programs, merchant services, reward points programs, and more. Thousands of analysts across the company require immediate access to this data throughout the day—via both ad-hoc, interactive queries and prebuilt reports.
To support those analysis needs, the company maintains a data lake containing several petabytes of detailed transaction data, along with multiple data warehouses that map to its various lines of business. Like most financial services companies, its top requirements for these systems include speed, scalability, and price-performance. Those requirements are why the company chose Yellowbrick for data lake augmentation and data warehouse modernization.
The company first considered Yellowbrick when it set out to augment its MapR-based data lake with a modern data warehouse, with the primary goal of faster ad-hoc query performance. At the time, the company was using Hive for reporting. As data-driven businesses across all industries have come to realize, however, this architecture (a SQL query layer on top of a data lake) couldn’t support thousands of analysts running interactive, ad-hoc queries on large data sets.
Instead, this customer envisioned a better approach: continue leveraging what its data lake did well—cost-effective storage for petabytes of raw data—and augment it by moving the data it contains into a modern data warehouse capable of delivering far greater speed, scalability, and price-performance.
In evaluating its options for data lake augmentation, the company compared a 15-node, 6U Yellowbrick system against a 20-rack Teradata 6800. The company loaded two years of historical data from MapR into each system and then ran a set of 240 queries against both. Yellowbrick beat the Teradata system on all but two of them.
Shortly thereafter, the company purchased its first Yellowbrick system. Every new transaction across the company’s lines of business is loaded into Yellowbrick, whose 5 PB capacity is enough to store five full years of transaction data. Because of Yellowbrick’s unique architecture, all of this data remains “hot” at all times, so it is instantly queryable by the system’s thousands of daily users.
Following its successful initial deployment of Yellowbrick, the company began looking at where else it could leverage Yellowbrick’s unparalleled speed, capacity, and overall price-performance. It has since deployed several additional Yellowbrick systems, including ones that modernize existing Netezza and DB2 systems.
Today, the company is actively working with Yellowbrick on several new initiatives, including one for fraud detection. In addition, although the company’s current use of Yellowbrick has been entirely on-premises, it plans to investigate how it can take advantage of Yellowbrick’s unique hybrid-cloud architecture to run its analytics workloads wherever it makes the most sense: on-premises, in a private cloud, in the public cloud, or any combination thereof—with the same predictable price-performance.
Through its use of Yellowbrick, this customer is benefiting in several ways.
The company’s continued adoption of Yellowbrick across new parts of its business is clear proof of the value that Yellowbrick is delivering—and how it can drive new business value across both data lake augmentation and data warehouse modernization scenarios.
Industry: Financial Services
Customer: A multinational financial services company and member of the S&P 100.
Challenge: The company’s Hive-on-MapR data lake couldn’t support its thousands of users, who complained about slow response times for their ad-hoc, interactive queries. In addition, the company needed better performance from its aging Netezza data warehouse environment.
Solution: The company augmented its data lake by moving the data it contains into a modern data warehouse based on Yellowbrick, and has since deployed several additional Yellowbrick systems to modernize existing Netezza and DB2 systems.