image.png

| Architecture Type | Definition | Suitable Data Types | Use Cases |
| --- | --- | --- | --- |
| Data Warehouse | Centralized repository that stores structured data from multiple sources for analysis and reporting | Structured data from operational systems, business applications, and databases | Business intelligence, historical analysis, enterprise reporting |
| Data Lake | Storage repository that holds raw data in its native format until needed | Structured, semi-structured, and unstructured data (logs, images, videos, social media, IoT data) | Big data analytics, machine learning, data discovery |
| Data Lakehouse | Hybrid architecture combining data lake storage with data warehouse capabilities | Both structured and unstructured data, with schema enforcement when needed | Real-time analytics, unified data platform, combined ML and BI workloads |
| Data Mesh | Decentralized architecture treating data as a product owned by domain teams | Domain-specific data products with standardized interfaces | Large organizations with diverse data domains, distributed ownership |
| Data Fabric | Integrated layer of technologies and services providing consistent capabilities across environments | Enterprise-wide data from multiple sources requiring integration | Cross-platform data integration, metadata management, governance |
| Lambda Architecture | Data processing architecture with batch and stream processing paths | High-velocity data requiring both real-time and batch processing | Real-time analytics with historical context, IoT applications |

Data Warehouse Types:

Inmon: Source > Staging (where raw data lands) > Enterprise Data Warehouse (where data is normalized and structured in third normal form, 3NF) > Data Marts (subsets of the data transformed for reporting) > BI Tool

Kimball: Source > Staging > Data Marts (loaded directly, with no enterprise warehouse in between) > BI Tool. Faster to deliver than the Inmon approach, but the same data tends to be duplicated across marts.

Data Vault: Source > Staging > Raw Vault (where data is still raw) > Business Vault (where business logic and transformations are applied) > Data Marts > BI Tool

Medallion Architecture: Source > Bronze Layer (where data is kept raw, which helps us trace issues) > Silver Layer (where transformations and data cleansing are applied, but no business rules yet) > Gold Layer (where we can build objects not only for reporting but also for machine learning and AI) > BI Tool
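To make the Medallion flow more concrete, here is a minimal sketch in plain Python. The sample rows, the field names, and the functions `load_bronze`, `build_silver`, and `build_gold` are all assumptions made for illustration; a real project would typically do this in SQL or a data platform rather than in in-memory lists.

```python
from collections import defaultdict

# Hypothetical raw rows as they might arrive from a source system.
RAW_ORDERS = [
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},
    {"order_id": "2", "customer": "BOB", "amount": "4.00"},
    {"order_id": "2", "customer": "BOB", "amount": "4.00"},    # duplicate row
    {"order_id": "3", "customer": "Alice", "amount": "oops"},  # unparseable amount
]


def load_bronze(source_rows):
    """Bronze: store rows exactly as received, with no changes."""
    return [dict(row) for row in source_rows]


def build_silver(bronze_rows):
    """Silver: clean and standardize, but apply no business rules."""
    seen, silver = set(), []
    for row in bronze_rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # drop rows that repeat an order id already kept
        seen.add(order_id)
        silver.append({
            "order_id": order_id,
            "customer": row["customer"].strip().title(),
            "amount": amount,
        })
    return silver


def build_gold(silver_rows):
    """Gold: business-ready aggregate (total spend per customer)."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


bronze = load_bronze(RAW_ORDERS)
silver = build_silver(bronze)
gold = build_gold(silver)
```

Because bronze keeps all four rows untouched, a dropped or suspicious record in silver can always be traced back to what the source actually sent.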

image.png

Since we chose the Medallion Architecture, here is a diagram summarizing each stage and its characteristics:

image.png

P.S.: Separation of concerns

image.png

Separation of concerns means that every layer is designed to be fully independent and takes full responsibility for the tasks assigned to it. For example, if data cleansing belongs to the silver layer, every aspect of cleansing is handled there, and no data is passed to another layer to be cleaned, whether bronze or gold.
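One way to picture this rule is as a small guard that rejects a pipeline step placed in the wrong layer. The sketch below is only an illustration: the task names in `LAYER_TASKS` and the helpers `owning_layer` and `check_step` are hypothetical, not part of any tool used in this project.

```python
# Hypothetical mapping of each layer to the only kinds of work it may do.
LAYER_TASKS = {
    "bronze": {"ingest"},
    "silver": {"clean", "standardize", "deduplicate"},
    "gold": {"aggregate", "apply_business_rule"},
}


def owning_layer(task):
    """Return the single layer responsible for a task."""
    owners = [layer for layer, tasks in LAYER_TASKS.items() if task in tasks]
    if len(owners) != 1:
        raise ValueError(
            f"task {task!r} must belong to exactly one layer, found {owners}"
        )
    return owners[0]


def check_step(layer, task):
    """Reject a pipeline step placed in a layer that does not own it."""
    owner = owning_layer(task)
    if owner != layer:
        raise ValueError(f"{task!r} belongs in {owner}, not {layer}")
    return True
```

With this mapping, `check_step("silver", "clean")` passes, while `check_step("gold", "clean")` raises a `ValueError`, which mirrors the idea that cleansing is never deferred to another layer.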

Data_Architecture_Design.jpg

Here, draw.io is used to draw the diagram, giving a visual cue for the whole design: what our goals are, and how each layer is set up and on what basis.