Scenario 1: Removing duplicate rows from a relational database source:
Suppose the source system contains duplicate records and we want to load only the unique records into the target system. What is the approach?
If the source system is a relational database, we can check the Select Distinct option in the Source Qualifier of the source table. The Source Qualifier then reads only distinct rows from the source, and we load the target from its output.
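As a rough illustration of what the distinct option achieves on a relational source, the sketch below runs a SELECT DISTINCT against an in-memory SQLite database. It is an analogy in Python, not Informatica code; the table names src_customers and tgt_customers, their columns, and the sample rows are all hypothetical.

```python
import sqlite3


def load_unique_rows(rows):
    """Load rows into a source table, then copy only distinct rows to a target."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical source and target tables, for illustration only.
    conn.execute("CREATE TABLE src_customers (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE tgt_customers (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO src_customers VALUES (?, ?)", rows)
    # DISTINCT is applied while reading the source, so duplicates never reach the target.
    conn.execute(
        "INSERT INTO tgt_customers SELECT DISTINCT id, name FROM src_customers"
    )
    result = conn.execute(
        "SELECT id, name FROM tgt_customers ORDER BY id, name"
    ).fetchall()
    conn.close()
    return result


source_rows = [(1, "Ann"), (2, "Bob"), (1, "Ann"), (3, "Cy"), (2, "Bob")]
unique_rows = load_unique_rows(source_rows)
print(unique_rows)
```

A row counts as a duplicate only when every selected column matches, so two rows that share an id but differ in name would both be loaded.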
Scenario 2: Removing duplicate records from a flat file source:
Because the source system here is a flat file, the distinct option in the Source Qualifier is disabled and cannot be selected. The next approach is to use a Sorter Transformation and check its Distinct option. When we select the Distinct option, all the columns are selected as sort keys, in ascending order by default.
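The effect of sorting on every column and keeping one copy of each row can be sketched in Python as follows. This is a conceptual analogy only, not a description of how the Sorter Transformation is implemented; the function name sorted_distinct and the sample file contents are hypothetical.

```python
import csv
import io


def sorted_distinct(records):
    """Sort on all columns in ascending order, then keep one copy of each row."""
    # Tuples compare column by column, so every column acts as a sort key.
    ordered = sorted(records)
    unique = []
    for rec in ordered:
        # After sorting, identical rows are adjacent, so comparing with the
        # last kept row is enough to detect a duplicate.
        if not unique or rec != unique[-1]:
            unique.append(rec)
    return unique


# Hypothetical flat file contents; io.StringIO stands in for a file on disk.
flat_file = io.StringIO("id,name\n2,Bob\n1,Ann\n2,Bob\n3,Cy\n1,Ann\n")
reader = csv.reader(flat_file)
header = next(reader)  # skip the header row
records = [tuple(row) for row in reader]

deduped = sorted_distinct(records)
print(deduped)
```

Note that the output comes back sorted as a side effect, and that values read from a flat file are compared as strings here, so "10" sorts before "2" unless the columns are converted to numeric types first.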