Need to solve the word count program using hadoop streaming following

Need your ASSIGNMENT done? Use our paper writing service to score better and meet your deadline.


Order a Similar Paper HERE Order a Different Paper HERE

1. In class we wrote a MapReduce program in Java to compute the word counts for any given input. In this assignment, you will repeat solving the same problem but using Hadoop streaming. 

2. Create two scripts in Python namely wordcount_map.py and wordcount_reduce.py to be used by the mappers and reducers of the streaming job. 

3. Your script files must be executable (consider chmod command), and must include the necessary shebang (like in the attached script files).

 4. Attached are the script files we used in class to demonstrate Hadoop streaming, namely: maxtemp_map.py and maxtemp_reduce.py. They can help you to get started. 

5. Recall the streaming command:

 $ mapred streaming    

 -files <executable_map>,<executable_reduce>   

  -mapper <executable_map>    

 -reducer <executable_reduce>   

  -input <input-path>    

 -output <output-path> 

(extra options: -combiner, -numReduceTasks, etc.)

MaxTemperature Example file is the program file discussed in Class.