MapReduce - Chaining Jobs in HADOOP(MAPREDUCE)

asked marvit November 18, 2014 09:04 PM  

Chaining Jobs in HADOOP(MAPREDUCE)


1 Answer

answered by Experts-976
  • Not every problem can be solved with a MapReduce program, but fewer still are those which can be solved with a single MapReduce job.

  • Many problems can be solved with MapReduce by writing several MapReduce steps that run in series to accomplish a goal: Map1 -> Reduce1 -> Map2 -> Reduce2 -> Map3...

  • You can easily chain jobs together in this fashion by writing multiple driver methods, one for each job.

  • Call the first driver method, which uses JobClient.runJob() to submit the job and block until it completes. Once it has completed, call the next driver method, which creates a new JobConf object referring to different Mapper and Reducer classes, and so on. Each later job typically reads the previous job's output directory as its input.
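
The steps above can be sketched roughly as follows, using the old `org.apache.hadoop.mapred` API that `JobClient.runJob()` belongs to. This is only an illustration, not a definitive driver: the class name `ChainedJobs`, the method names `runFirstJob` and `runSecondJob`, the job names, and the three command-line paths are all assumptions made for the example. `IdentityMapper` and `IdentityReducer` stand in for your real Map1/Reduce1 and Map2/Reduce2 classes so that the sketch compiles on its own against the Hadoop libraries.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

// Hypothetical driver class: two jobs chained in series (Map1 -> Reduce1 -> Map2 -> Reduce2).
public class ChainedJobs {

    // Driver method for the first job: reads the raw input, writes an intermediate directory.
    static void runFirstJob(Path input, Path intermediate) throws IOException {
        JobConf conf = new JobConf(ChainedJobs.class);
        conf.setJobName("chain-step-1");
        conf.setMapperClass(IdentityMapper.class);   // placeholder for your Map1 class
        conf.setReducerClass(IdentityReducer.class); // placeholder for your Reduce1 class
        // With the default TextInputFormat, keys are byte offsets and values are lines.
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(conf, input);
        FileOutputFormat.setOutputPath(conf, intermediate); // directory must not exist yet
        // Blocks until the job finishes; throws IOException if the job fails.
        JobClient.runJob(conf);
    }

    // Driver method for the second job: consumes the first job's output directory.
    static void runSecondJob(Path intermediate, Path output) throws IOException {
        JobConf conf = new JobConf(ChainedJobs.class); // a fresh JobConf for each job
        conf.setJobName("chain-step-2");
        conf.setMapperClass(IdentityMapper.class);   // placeholder for your Map2 class
        conf.setReducerClass(IdentityReducer.class); // placeholder for your Reduce2 class
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(conf, intermediate);
        FileOutputFormat.setOutputPath(conf, output);
        JobClient.runJob(conf);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("Usage: ChainedJobs <input> <intermediate> <output>");
            System.exit(2);
        }
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]);
        Path output = new Path(args[2]);

        // The second driver is only reached if the first job completed successfully,
        // because a failed job surfaces as an IOException from runJob().
        runFirstJob(input, intermediate);
        runSecondJob(intermediate, output);
    }
}
```

Because `JobClient.runJob()` does not return until the job is done, plain sequential calls in `main` are enough to enforce the ordering. Running this requires the Hadoop libraries on the classpath and output directories that do not already exist.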

