Tuesday, 13 October 2015

Difference between normalization and demoralizations in data warehouse

Difference between normalization and demoralizations in data warehouse
3NF
Pro
  • save most storage of all modeling techniques
  • simple to understand
Con
  • bad performance with multiple joins
  • difficult to load (ETL)
  • changes are very difficult to handle
Star Schema
Pro
  • Simple design
  • fast queries
  • fits very well to olap models
  • many DBMS are optimized for queries on star schemas
Con
  • usually centralized on one fact table
  • higher storage usage due to de normalization
Normalization includes reducing redundancy, redundancy means datahttp://images.intellitxt.com/ast/adTypes/icon1.png repetition. By removing the repeated data from the table we can separate the redundant data and create another table that is we are separating repeated data from detail table and creating a master table. Like that we are separating the data. In this case we are not completely removing the redundant data from detail table we are maintain a common column in both the tables that is in master table & detail table. That is in master table common column is called primary key column, detail table common column is foreign key. This foreign key refers to primary key.

This is one stage normalization. Normalization has 3 steps 1normalform, 2, 3 normal from. In first normal form we separate redundant data from detail table and create the master tables.
2 normal from we maintain data consistency by creating primary key foreign key relations. In 3 normal from we remove any calculation columns from detail table.

This is normalization. Opposite of this above normal forms is de normalization.
That is we maintain redundant data in detail table in that case we maintain all the transaction data in one table .that is de normalization.

Normalization is done to remove redundancy (i.e. to remove duplicates) -----OLTP
Data warehousing contains denormalized data purposely since it’s used for analysis purpose. -----OLAP


Redundancy: repetition of data within a database, increasing the inefficiency of the possibility of errors in databases is called Redundancy.

In general, databases intended for on-line transaction processing (OLTP) are typically more normalized than databases intended for on-line analytical processing (OLAP).

No comments:

Post a Comment

Big Data FQA

1) What is a Big Data Architecture? Generally speaking, a Big Data Architecture is one which involves the storing and processing of data...