1) When you use a Sequential File stage as a source, the data has to be converted from ASCII to the native format when it is read, whereas with a Data Set no conversion is required because the data is already stored in native format. Also, by default, sequential files are processed sequentially only. Sequential files can hold up to 2 GB only. Sequential files do not support NULL values. All of the above can be overcome by using the Data Set stage, but the choice depends on the requirement. For example, if you want to capture rejected data, you need to use a Sequential File or File Set stage.
2) The Sequential File stage is used to extract data from flat files and to load data into flat files, with a limit of 2 GB. A Data Set is an intermediate stage; data is loaded into it in parallel, which improves performance.
3) A Data Set mainly consists of two kinds of files. a) A descriptor file, which contains the metadata and the data location but not the actual data itself. b) Data files, which contain the data itself, split across multiple files with one file per partition.
4) The orchadmin command is used to delete datasets, whereas the UNIX rm command is used to remove flat files. Running rm on a dataset would remove only the descriptor file and leave the data files behind.
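Points 3 and 4 can be illustrated with a small toy model. The sketch below is not the real Data Set format (which is internal to DataStage); it only assumes a JSON descriptor that lists one data file per partition. The names write_toy_dataset, delete_toy_dataset and demo, and the file names, are hypothetical and chosen for illustration. It shows why removing only the descriptor leaves the partition files orphaned, while a descriptor-aware delete cleans up everything.

```python
import json
import tempfile
from pathlib import Path


def write_toy_dataset(descriptor: Path, rows: list, partitions: int) -> list:
    """Write a toy 'dataset': one data file per partition plus a descriptor."""
    data_files = []
    for p in range(partitions):
        part_file = descriptor.with_name(f"{descriptor.stem}.part{p}")
        # Distribute the rows round-robin across the partitions.
        part_file.write_text("\n".join(rows[p::partitions]))
        data_files.append(part_file)
    # The descriptor holds metadata and data locations, never the rows.
    descriptor.write_text(json.dumps({
        "schema": ["line:string"],  # assumed toy schema
        "data_files": [str(f) for f in data_files],
    }))
    return data_files


def delete_toy_dataset(descriptor: Path) -> None:
    """Descriptor-aware delete: remove the listed data files, then the descriptor."""
    meta = json.loads(descriptor.read_text())
    for name in meta["data_files"]:
        Path(name).unlink(missing_ok=True)
    descriptor.unlink()


def demo() -> tuple:
    """Return (data files left after a plain remove, left after an aware delete)."""
    rows = ["a", "b", "c", "d", "e"]
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)

        # Case 1: remove only the descriptor, as a plain rm would.
        ds1 = base / "orders.ds"
        parts1 = write_toy_dataset(ds1, rows, partitions=2)
        ds1.unlink()
        orphaned = sum(f.exists() for f in parts1)

        # Case 2: delete through the descriptor.
        ds2 = base / "customers.ds"
        parts2 = write_toy_dataset(ds2, rows, partitions=2)
        delete_toy_dataset(ds2)
        remaining = sum(f.exists() for f in parts2)

    return orphaned, remaining


if __name__ == "__main__":
    print(demo())
```

In this toy model the plain remove leaves both partition files on disk, while the descriptor-aware delete leaves none. In a real installation the data files usually live in separate resource disk locations rather than next to the descriptor, which makes orphaned files even harder to find by hand.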