
DataStage - What is the exact difference between a data set and a file set in DataStage?




asked mar September 20, 2014 06:32 AM  

What is the exact difference between a data set and a file set in DataStage?


           

1 Answer



 
answered by vishnoiprem
  • File Set stage: it lets you read data from or write data to a file set. The stage can have a single input link, a single output link, and a single rejects link. It executes only in parallel mode. The data files and the file that lists them are together called a file set. This capability is useful because some operating systems impose a 2 GB limit on the size of a file, so you need to distribute the data across files on several nodes to prevent overruns. Data sets are used to bring data into parallel jobs, much as ODBC is used in server jobs.
  • The data set is the internal data format of the Orchestrate framework. Any other source data processed in a parallel job is first converted into data set format (handled by the "import" operator), and data written to any other target is converted out of data set format last (handled by the "export" operator). Because a data set needs neither conversion, it usually gives the best performance.
  • Data Set stage: this is a file stage. It lets you read data from or write data to a data set. The stage can have a single input link or a single output link, and it can be configured to execute in parallel or sequential mode.
  • A data set preserves partitioning. It stores the data on the nodes, so when you read from a data set you do not have to repartition the data.
  • A data set stores data in binary, in the internal format of DataStage, so reading from or writing to a data set takes less time than with other formats.
  • A data set cannot be viewed directly; you have to use a data set management tool.
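
To make the file set idea above concrete, here is a minimal Python sketch of the general pattern: records are spread across several size-capped data files, and one extra file lists them. This is only an illustration of the concept. It is not DataStage code and does not reproduce the real file set layout; the function names, the `.part`/`.list` file names, and the `max_bytes` cap (standing in for an operating system limit such as 2 GB) are all assumptions made up for this example.

```python
import os
import tempfile


def write_file_set(records, directory, name, max_bytes):
    """Write text records to size-capped part files plus one listing file.

    Hypothetical helper: illustrates "data files + a file that lists them".
    Returns the path of the listing file.
    """
    os.makedirs(directory, exist_ok=True)
    part_paths = []
    current = None   # currently open part file, if any
    written = 0      # bytes written to the current part file
    for record in records:
        line = (record + "\n").encode("utf-8")
        # Start a new part file when the next record would exceed the cap.
        if current is None or written + len(line) > max_bytes:
            if current is not None:
                current.close()
            path = os.path.join(directory, f"{name}.part{len(part_paths):04d}")
            part_paths.append(path)
            current = open(path, "wb")
            written = 0
        current.write(line)
        written += len(line)
    if current is not None:
        current.close()
    # The listing file records where every part file lives, in order.
    listing_path = os.path.join(directory, f"{name}.list")
    with open(listing_path, "w", encoding="utf-8") as listing:
        listing.write("\n".join(part_paths))
    return listing_path


def read_file_set(listing_path):
    """Read all records back by following the listing file."""
    with open(listing_path, encoding="utf-8") as listing:
        part_paths = [p for p in listing.read().splitlines() if p]
    records = []
    for path in part_paths:
        with open(path, encoding="utf-8") as part:
            records.extend(part.read().splitlines())
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        rows = [f"row-{i}" for i in range(10)]
        listing = write_file_set(rows, workdir, "demo", max_bytes=20)
        print(read_file_set(listing) == rows)
```

The reader only needs the listing file to find all the data, which is the same reason a job refers to a file set by the file that lists its data files rather than by each data file individually.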