Tuesday, 13 October 2015

Update without Update Strategy for Better Session Performance

You might have come across an ETL scenario, where you need to update a huge table with few records and occasional inserts. The straight forward approach of using Lookup transformation to identify the Inserts, Update and Update Strategy to do the Insert or Update may not be right for this particular scenario, mainly because of the Lookup transformation may not perform better and start degrading as the lookup table size increases.

In this article let’s talk about a design, which can take care of the scenario we just spoke.
The Theory
When you configure an Informatica Power Center session, you have several options for handling database operations such as insert, update, delete. 
Specifying an Operation for All Rows
During session configuration, you can select a single database operation for all rows using the Treat Source Rows As setting from the 'Properties' tab of the session.
  1. Insert: - Treat all rows as inserts. 
  2. Delete: - Treat all rows as deletes.
  3. Update: - Treat all rows as updates. 
  4. Data Driven :- Integration Service follows instructions coded into Update Strategy flag rows for insert, delete, update, or reject. 
Specifying Operations for Individual Target Rows
Once you determine how to treat all rows in the session, you can also set options for individual rows, which gives additional control over how each rows behaves. Define these options in the Transformations view on mapping tab of the session properties. 
  1. Insert: - Select this option to insert a row into a target table. 
  2. Delete: - Select this option to delete a row from a table. 
  3. Update :- You have the following options in this situation: 
    • Update as Update: - Update each row flagged for update if it exists in the target table. 
    • Update as Insert: - Insert each row flagged for update. 
    • Update else Insert: - Update the row if it exists. Otherwise, insert it. 
  4. Truncate Table: - Select this option to truncate the target table before loading data.
Design and Implementation
Now we understand the properties we need to use for our design implementation.

We can create the mapping just like an 'INSERT' only mapping, with out LookUp, Update Strategy Transformation. During the session configuration lets set up the session properties such that the session will have the capability to both insert and update.

First set Treat Source Rows As property as shown in below image.


Now let’s set the properties for the target table as shown below.  Choose the properties Insert and Update else Insert.

That all we need to set up the session for update and insert without update strategy.

How to generate sequence numbers using expression transformation?
Solution:
In the expression transformation, create a variable port and increment it by 1. Then assign the variable port to an output port. In the expression transformation, the ports are:
V_count=V_count+1
O_count=V_count

Design a mapping to load the first 3 rows from a flat file into a target?
Solution:
You have to assign row numbers to each record. Generate the row numbers either using the expression transformation as mentioned above or use sequence generator transformation.
Then pass the output to filter transformation and specify the filter condition as O_count <=3


No comments:

Post a Comment

Big Data FQA

1) What is a Big Data Architecture? Generally speaking, a Big Data Architecture is one which involves the storing and processing of data...