Multidimensional data cubes (cubes) are commonly used in financial and operational analytics. Many Vendors and CFO’s are touting its benefits. Yet, we have not found good articles that explain the basic concepts of cubes. In this blog post, we will cover the basics of cubes through the use case of a grocery chain.
A brief history
The fancy name for a multi-dimensional cube is an OLAP database. OLAP – Online Analytical Processing, as the name suggests is useful for analytics. OLAP databases were different than relational databases such as SQL. SQL was good for storing and retrieving small amounts of transactional data faster. OLAP was better at slicing and dicing large amounts of data to create reports.
Today, many databases have optimized their SQL queries. As a result, the speed of data extraction is not as much of an issue. For example, Amazon Redshift, Google Big Query and Snowflake databases are designed for querying. Even standard database providers such as MySQL have improved their query performance.
So what are the benefits of cubes? One of the key advantages of cube architecture is the simplicity in slicing and dicing the data. Tools such as Pivot Tables built for cubes have made this easy.
Example Case Study
Let’s talk through it using a simple problem statement. Consider the example of a grocery chain in many locations. Each store is collecting sales data for many product lines. See the figure below.
In this case, the two dimensions are products and time as illustrated in the figure below.
We can imagine that the same data may need to be collected at every location. Hence, location can also become a dimension as illustrated below.
For planning, we have to collect the same data for many scenarios such as budget, actuals, and forecast. Hence Scenarios will become another dimension. As illustrated below.
We will also collect data for Expenses. Expenses are another measure. The expense measure will share dimensions with the Sales measure.
Once you have the data stored in a cube, you can slice and dice the data in many ways. For example, scenario (Dimension 4) can be compared with Product (Dimension 1). As illustrated below.
Scenario (Dimension 4) can be compared with Locations (Dimension 3)
Both sales and expenses can be compared across Product (Dimension 1) and Locations (Dimension 3).
You might have already noticed the similarities of Pivot Tables to the figures above. The above figures were created in Pivot Tables in Excel. Pivot Tables actually originated for multi-dimensional data analysis. We recommend the Coursera Data Warehouse course for further details. They even have a free way to audit the courses to test it out.
Cloud Cube Application
As shown above, cubes provide a level of slicing and dicing not possible in spreadsheets. Let us see how cubes can become even more powerful under a cloud application. With minimal initial work, a cube can be designed, templates can be created and user data entry can be tracked. We can build-in alerts for notifying people to stay on top of tasks.
Imagine if you can automatically get reports on your store footfall on a daily basis and your sales and expenses on a weekly basis without any intervention from your team. You and your team members can be spending time with detailed analytics and subjective data gathering to get a pulse on the market and stay ahead of the competition.
PivotXL is a cloud cube application that can allow finance admins to create cubes in a web application and link Excel spreadsheets to it.