The code is as follows:

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountNew extends Configured implements Tool {

    public static class WordCountReduceMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Read one line of input
            String line = value.toString();
            // Split the line into words on whitespace
            StringTokenizer st = new StringTokenizer(line);
            // Emit each word with a count of 1
            while (st.hasMoreTokens()) {
                word.set(st.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class WordCountReduce
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        /**
         * Records with the same key are grouped into a single reduce call, so the
         * key is fixed within each call and values holds one count per occurrence
         * (each 1 from the mapper, or a partial sum if the combiner ran).
         */
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            // Sum all counts for this word
            for (IntWritable value : values) {
                sum += value.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        // Job.getInstance replaces the deprecated new Job(Configuration) constructor
        Job job = Job.getInstance(getConf());
        job.setJarByClass(WordCountNew.class);
        job.setJobName("mywordcount");

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(WordCountReduceMapper.class);
        job.setReducerClass(WordCountReduce.class);
        job.setCombinerClass(WordCountReduce.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        boolean success = job.waitForCompletion(true);
        return success ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int result = ToolRunner.run(new WordCountNew(), args);
        System.exit(result);
    }
}
```
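The core counting logic of the job above (tokenize each line on whitespace, emit (word, 1), then sum per word) can be sanity-checked locally without a Hadoop cluster. This is a minimal sketch using a plain `HashMap`; `WordCountLocal` is a hypothetical helper for illustration, not part of the job:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class WordCountLocal {
    // Mirrors the map + reduce logic: tokenize each line, sum counts per word.
    public static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            // Same whitespace tokenization the mapper uses
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
                // Equivalent to emitting (word, 1) and summing in reduce
                counts.merge(st.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> c = count(new String[]{"hello world", "hello hadoop"});
        System.out.println(c.get("hello")); // prints 2
    }
}
```

Running the real job on a cluster still goes through `ToolRunner`, e.g. `hadoop jar wordcount.jar WordCountNew <input path> <output path>` (jar name is an assumption; the output directory must not already exist).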