Data modeling is the process of documenting a
complex software system design as an easily understood diagram, using text and
symbols to represent the way data needs to
flow.
Model: Model
is business representation of information in one or more data sources.
In OLTP data model is designed with the help of Entity Relation
model. This ER Model can be modified during the ETL process to fit into
the data warehousing project.
In data warehousing data modeling can be achieved with the help of
dimensional modeling based on facts and measures
Data Modeling Types:-
Conceptual Data Modeling
Conceptual data model includes all
major entities and relationships and but does not contain much detailed level
of information about attributes and is often used in the INITIAL PLANNING PHASE
Conceptual data model is created
by gathering business requirements from various sources like business
documents, discussion with functional teams, business analysts, smart
management experts and end users who do the reporting on the database. Data
modelers create conceptual data model and forward that model to functional team
for their review.
Conceptual data modeling gives an
idea to the functional and technical team about how business requirements would
be projected in the logical data model.
Logical Data Modeling
This is the actual implementation
and extension of a conceptual data model. Logical data model includes all
required entities, attributes, key groups, and relationships that represent
business information and define business rules.
Physical Data Modeling
Physical data model includes all
required tables, columns, data types, relationships, database properties for
the physical implementation of databases. Database performance, indexing
strategy, physical storage and demoralization are important parameters of a
physical model.
The three levels of data modeling, conceptual data model, logical data model, and physical data model, were discussed in
prior sections. Here we compare these three types of data models. The table
below compares the different features:
Feature
|
Conceptual
|
Logical
|
Physical
|
Entity Names
|
✓
|
✓
|
|
Entity Relationships
|
✓
|
✓
|
|
Attributes
|
|
✓
|
|
Primary Keys
|
|
✓
|
✓
|
Foreign Keys
|
|
✓
|
✓
|
Table Names
|
|
|
✓
|
Column Names
|
|
|
✓
|
Column Data Types
|
|
|
✓
|
Below we show the conceptual, logical, and physical
versions of a single data model.
Dimensional Data Modeling
·
Dimension model consists of fact
and dimension tables
·
It is an approach to develop the
schema DB designs
Dimensional data model is most often used in
data warehousing systems. This is different from the 3rd normal form, commonly
used for transactional (OLTP) type systems. As you can imagine, the same data
would then be stored differently in a dimensional model than in a 3rd normal
form model.
To understand dimensional data modeling, let's define
some of the terms commonly used in this type of modeling:
Dimension: A category of
information. For example, the time dimensions.
Attribute: A unique level
within a dimension. For example, Month is an attribute in the Time Dimension.
Hierarchy: The specification of
levels that represents relationship between different attributes within a
dimension. For example, one possible hierarchy in the Time dimension is Year →
Quarter → Month → Day.
Fact Table: A fact table is a
table that contains the measures of interest. For example, sales amount would
be such a measure. This measure is stored in the fact table with the
appropriate granularity. For example, it can be sales amount by store by day.
In this case, the fact table would contain three columns: A date column, a
store column, and a sales amount column.
Lookup Table: The lookup table
provides the detailed information about the attributes. For example, the lookup
table for the Quarter attribute would include a list of all of the quarters
available in the data warehouse. Each row (each quarter) may have several
fields, one for the unique ID that identifies the quarter, and one or more
additional fields that specifies how that particular quarter is represented on
a report (for example, first quarter of 2001 may be represented as "Q1
2001" or "2001 Q1").
A dimensional model includes fact tables and lookup
tables. Fact tables connect to one or more lookup tables, but fact tables do
not have direct relationships to one another. Dimensions and hierarchies are
represented by lookup tables. Attributes are the non-key columns in the lookup
tables.
Data Modeling is
a process of designing the data warehouse with its following types of schema
·
Star schema
·
Snowflake schema