Friday, 23 December 2016

Data Modeling

Data modeling is the process of documenting a complex software system design as an easily understood diagram, using text and symbols to represent the way data needs to flow. 

Model: Model is business representation of information in one or more data sources.
In OLTP data model is designed with the help of Entity Relation model. This ER Model can be modified during the ETL process to fit into the data warehousing project.
In data warehousing data modeling can be achieved with the help of dimensional modeling based on facts and measures
Data Modeling Types:-
Conceptual Data Modeling
Conceptual data model includes all major entities and relationships and but does not contain much detailed level of information about attributes and is often used in the INITIAL PLANNING PHASE

Conceptual data model is created by gathering business requirements from various sources like business documents, discussion with functional teams, business analysts, smart management experts and end users who do the reporting on the database. Data modelers create conceptual data model and forward that model to functional team for their review.

Conceptual data modeling gives an idea to the functional and technical team about how business requirements would be projected in the logical data model.

Logical Data Modeling
This is the actual implementation and extension of a conceptual data model. Logical data model includes all required entities, attributes, key groups, and relationships that represent business information and define business rules.

Physical Data Modeling
Physical data model includes all required tables, columns, data types, relationships, database properties for the physical implementation of databases. Database performance, indexing strategy, physical storage and demoralization are important parameters of a physical model.

The three levels of data modeling, conceptual data model, logical data model, and physical data model, were discussed in prior sections. Here we compare these three types of data models. The table below compares the different features:

Feature
Conceptual
Logical
Physical
Entity Names

Entity Relationships

Attributes


Primary Keys

Foreign Keys

Table Names


Column Names


Column Data Types


Below we show the conceptual, logical, and physical versions of a single data model.

Dimensional Data Modeling
·        Dimension model consists of fact and dimension tables
·        It is an approach to develop the schema DB designs

Dimensional data model is most often used in data warehousing systems. This is different from the 3rd normal form, commonly used for transactional (OLTP) type systems. As you can imagine, the same data would then be stored differently in a dimensional model than in a 3rd normal form model.
To understand dimensional data modeling, let's define some of the terms commonly used in this type of modeling:

Dimension: A category of information. For example, the time dimensions.

Attribute: A unique level within a dimension. For example, Month is an attribute in the Time Dimension.

Hierarchy: The specification of levels that represents relationship between different attributes within a dimension. For example, one possible hierarchy in the Time dimension is Year → Quarter → Month → Day.

Fact Table: A fact table is a table that contains the measures of interest. For example, sales amount would be such a measure. This measure is stored in the fact table with the appropriate granularity. For example, it can be sales amount by store by day. In this case, the fact table would contain three columns: A date column, a store column, and a sales amount column.

Lookup Table: The lookup table provides the detailed information about the attributes. For example, the lookup table for the Quarter attribute would include a list of all of the quarters available in the data warehouse. Each row (each quarter) may have several fields, one for the unique ID that identifies the quarter, and one or more additional fields that specifies how that particular quarter is represented on a report (for example, first quarter of 2001 may be represented as "Q1 2001" or "2001 Q1").
A dimensional model includes fact tables and lookup tables. Fact tables connect to one or more lookup tables, but fact tables do not have direct relationships to one another. Dimensions and hierarchies are represented by lookup tables. Attributes are the non-key columns in the lookup tables.

Data Modeling is a process of designing the data warehouse with its following types of schema
·         Star schema
·         Snowflake schema

No comments:

Post a Comment

Big Data FQA

1) What is a Big Data Architecture? Generally speaking, a Big Data Architecture is one which involves the storing and processing of data...