Run Word Count Java Mapreduce Program in Hadoop

Jun 24, 2021 Word Count, Hadoop, Mapreduce

In this article, we'll discuss how to run a word count Java MapReduce program in Hadoop.

How to create a jar file for WordCount using the Eclipse IDE for Java:

1:- Create a Java project in Eclipse named “WordCount”.

2:- Create a class file named “WordCount.java” in the src folder.

3:- Download hadoop-core.jar and hadoop-commons.jar.

4:- Right-click the “WordCount” project -> Properties -> “Java Build Path” -> “Libraries” tab -> Add External JARs.

5:- Select the downloaded jars (hadoop-core.jar and hadoop-commons.jar), then click Apply and Close.

6:- Add the "Mapper", "Reducer" and "Driver" code to WordCount.java:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// The public class name must match the file name, WordCount.java.
public class WordCount {

    // Mapper: emits (word, 1) for every whitespace-separated token in a line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums all the counts emitted for the same word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Driver: configures the job; args[0] is the input path, args[1] the output path.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // The reducer also serves as the combiner, because summing is associative.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
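To see what the mapper and reducer above compute without starting a cluster, the same logic can be sketched in plain Java. This is only a local simulation under stated assumptions: the class name LocalWordCountSketch and its methods are hypothetical, it uses no Hadoop classes, and it keeps all data in memory, which a real job does not.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local simulation of the word count flow; no Hadoop classes involved.
final class LocalWordCountSketch {

    // "Map" step: emit one (word, 1) pair per whitespace-separated token,
    // using StringTokenizer with its default delimiters as the mapper does.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<String, Integer>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps combined: group the pairs by word and sum the values.
    // A TreeMap keeps the words sorted, similar to the sorted keys a reducer receives.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map step on every line, then reduces all emitted pairs.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        // Illustrative input lines, not taken from the article.
        Map<String, Integer> counts = countWords(Arrays.asList("hello hadoop", "hello world"));
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

Because the tokenizer splits on whitespace only, words that differ in case or carry punctuation (for example "Hello" and "hello,") are counted as separate words, both in this sketch and in the mapper above.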

7:- Compile the project.

8:- Right-click the project -> Export -> Jar -> set the location and name of the jar.

9:- Once the jar file has been created, copy it to the Hadoop environment and open a terminal. When all the Hadoop services have started, choose the file whose words you want to count.

10:- Display the contents of the file.

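11:- Run the word count on the file. Pass the name of the driver class (the class that contains main()), followed by the input and output paths that main() reads from args[0] and args[1]. The output path must not already exist.

hadoop jar wordCount.jar <driver class> <input path> <output path>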

12:- After the program has executed, the word counts are written to the output directory.
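With Hadoop's default text output format, each result line holds a word and its count separated by a tab, and the result file in the output directory is typically named part-r-00000 (it can be printed with hadoop fs -cat). As a rough sketch of how such lines could be read back into a map once they are available locally, assuming that tab-separated layout: the class name WordCountOutputSketch is hypothetical and the sample lines are made up for illustration, not real job output.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for reading word count results; assumes "word<TAB>count" lines.
final class WordCountOutputSketch {

    // Parses each non-empty line into a (word, count) entry, keeping the file order.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("Expected word<TAB>count, got: " + line);
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed layout.
        List<String> sample = Arrays.asList("hadoop\t1", "hello\t2", "world\t1");
        System.out.println(parse(sample));
    }
}
```

Splitting at the last tab is a small design choice: the count is always the final field, so a word that happened to contain a tab would still be parsed as one key.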

Nitin Pandit

With over 10 years of development experience across different technologies, Nitin Pandit is a Microsoft-certified Most Valuable Professional (Microsoft MVP) with a rich skill set that includes developing and managing IT/web-based applications in technologies such as C#.NET, ADO.NET, LINQ to SQL, WCF, ASP.NET 2.0/3.x/4.0, WPF, MVC 5.0 (Razor), and Silverlight, along with client-side programming techniques like jQuery and AngularJS. Nitin holds a Master’s degree in Computer Science and has been actively contributing to the development community. He has written more than 100 blogs/articles and 3 eBooks on different technologies to help improve the knowledge of young technology professionals. He has trained more than one lakh students and professionals as a speaker in workshops and AppFests conducted in more than 25 universities in North India.
