Structural Differences Between OLTP and OLAP Systems

Jul 24, 20243 min read

In one of my previous articles, we discussed the main differences between OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) systems, including their purposes, use cases, and how they are often used together within the same ecosystems (if you missed it, here’s the link). In this article, we will focus on the actual structural differences between the two systems and how data is transformed from OLTP to OLAP. Two key structural differences between OLTP and OLAP systems are:

Entity-Centric Data Grouping vs. Schema-Based Data Grouping
Separation of Measures and Dimensions

Entity-Centric Data Grouping vs. Schema-Based Data Grouping

In an OLTP system, data is organized around specific entities (such as customers, products, or orders). Each table typically corresponds to an entity and contains fields relevant to that entity, focusing on normalized data structures to minimize redundancy. In contrast, OLAP systems use schema-based organization, structuring data according to specific schemas designed to facilitate complex queries and analysis. This means that data from multiple entities might be grouped together in a way that depends on the level of normalization typical of the adopted schema (click here for more information about the most common schemas: Star, Snowflake and Galaxy).

Separation of Measures and Dimensions

Another important consideration is about Measures and Dimensions separation: In OLAP systems, measures and dimensions are typically stored separately to optimize data analysis. Measures are stored in fact tables, which contain the quantitative data for analysis, such as sales figures, revenue, or counts. Dimensions, on the other hand, are stored in separate dimension tables, which provide contextual information like product names, time periods, and geographic locations.

For example, the figure above is a typical representation of an OLTP system which has the following characteristics:

1. Fully normalized (it has been created one table for each entity — like products table and categories table)

2. Measures and dimensions coexist in the same table (ex. Products Name and Standard Price in the products table).

This other figure instead, is a typical representation of a star schema with the following characteristics:

1. as start schemas are denormalized, some tables have been grouped together to reduce the number of joins in the queries and speed up the data retrieval (ex. Products table and Categories tables have been grouped together)

2. all measures have been group together in a single table, also called Fact tables, which is at the centre of the schema; all other tables contains just non numeric data also called dimensions which provides contextual information

Another two important distinctions between the two systems are Time Dimensions and Data Granularity.

Time Dimension

Although dates are considered “dimensions,” they are not stored directly in the original dimension tables or in the fact tables. Instead, they are stored in a dedicated table called the Time Dimension or Date Dimension. This table typically contains not only dates but also additional fields related to those dates, such as the name of the day of the week, day number, week number, month number, year, fiscal week, and more. These additional fields are very powerful when used as filters in BI tools, as they allow users to aggregate data at different levels of granularity, providing greater flexibility in data analysis.

Data Granularity

While OLTP systems focus on transaction-level data where each record represents a specific event or transaction with high detail and granularity, OLAP tables may store data at various levels of granularity, ranging from highly detailed transactional data to aggregated summary data. The main purpose of having tables storing data at different levels of granularity is not primarily for flexibility in data analysis (as the Time Dimension discussed priviously effectively serves that purpose) but rather to improve performance. For example, fact tables storing monthly aggregated data have significant less data of a table storing daily aggregated data, thereby enhancing query performance during analysis executions.

Conclusion

Understanding the structural differences between OLTP and OLAP systems is crucial for designing an effective ETL (Extract, Transform, Load) process. This knowledge ensures that data is accurately transformed and optimized for analytical purposes in the OLAP environment. A well-implemented OLAP system, with properly organized schema-based structures and efficient data aggregation, is essential for conducting comprehensive data analysis, including Business Intelligence (BI) analysis. By leveraging these systems appropriately, organizations can enhance their data processing capabilities and derive meaningful insights, supporting strategic decision-making and driving business success.

Thanks for being a part of our community!

If you found this article helpful and would like to show your support, don’t hesitate to:

Clap on this story
Leave a comment below telling me what you think. This will help me with the next articles
Support my work on Buy Me a Coffee ☕️

These actions really really help me out, and are much appreciated!

Follow me for more insights on LinkedIn | YouTube | Medium