
DataStage - How to remove duplicates in transformer stage in parallel mode?




asked mar September 20, 2014 06:32 AM  

How to remove duplicates in transformer stage in parallel mode?


           

1 Answer



 
answered by vishnoiprem
Remove Duplicates stage: Partitioning on input links

  • The Partitioning tab lets you specify how the incoming data is partitioned or collected before the stage removes duplicates.
  • By default, the stage uses the Auto partitioning method.
  • If the Remove Duplicates stage runs in sequential mode, it first collects the data using the default Auto collection method before removing duplicates.
  • The Partitioning tab lets you override this default behavior. Its exact operation depends on:
      • whether the Remove Duplicates stage is set to execute in parallel or sequential mode;
      • whether the preceding stage in the job is set to execute in parallel or sequential mode.
  • If the Remove Duplicates stage is set to execute in parallel, you can set a partitioning method by selecting from the Partition type drop-down list. This overrides any current partitioning.
  • If the Remove Duplicates stage is set to execute in sequential mode but the preceding stage executes in parallel, you can set a collection method from the Collector type drop-down list.
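Why the partitioning method matters in parallel mode can be shown with a small conceptual sketch. This is plain Python, not DataStage code. The function names (`hash_partition`, `remove_duplicates`, `dedupe_parallel`) and the `id` column are made up for illustration, and the partitions are processed one after another rather than on separate nodes. The idea it illustrates: if rows are partitioned by a hash of the key column, all rows with the same key land in the same partition, so each partition can drop its duplicates independently.

```python
from collections import defaultdict


def hash_partition(rows, key, num_partitions):
    """Send every row with the same key value to the same partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return [partitions[i] for i in range(num_partitions)]


def remove_duplicates(partition, key):
    """Keep the first row seen for each key value within one partition."""
    seen = set()
    kept = []
    for row in partition:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    return kept


def dedupe_parallel(rows, key, num_partitions=4):
    """Partition by key, then remove duplicates inside each partition."""
    result = []
    for part in hash_partition(rows, key, num_partitions):
        result.extend(remove_duplicates(part, key))
    return result


if __name__ == "__main__":
    rows = [
        {"id": 1, "value": "a"},
        {"id": 2, "value": "b"},
        {"id": 1, "value": "c"},  # duplicate of id 1
        {"id": 3, "value": "d"},
        {"id": 2, "value": "e"},  # duplicate of id 2
    ]
    for row in dedupe_parallel(rows, "id", num_partitions=2):
        print(row)
```

With a partitioning method that ignores the key (round robin, for example), two rows with the same `id` could end up in different partitions and both would survive, which is why a key-based method is usually chosen when the stage runs in parallel.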