mapred code example: running a job with command-line arguments

jsh0401


Blog category: hadoop1.x

package cmd;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

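/**
 * Run from the command line:
 *   hadoop jar xx.jar INPUT_PATH OUT_PATH
 * (if the jar's manifest does not declare a Main-Class, put the class name
 * cmd.WordCountApp right after the jar name).
 *
 * @author Administrator
 */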
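public class WordCountApp extends Configured implements Tool {

    /**
     * Entry point. ToolRunner parses the generic Hadoop options (such as
     * -D key=value), stores them in the Configuration, and passes the
     * remaining arguments to run().
     *
     * @param args INPUT_PATH and OUT_PATH
     * @throws Exception if the job cannot be set up or submitted
     */
    public static void main(String[] args) throws Exception {
        // Propagate the job status to the shell as the process exit code.
        System.exit(ToolRunner.run(new WordCountApp(), args));
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: hadoop jar xx.jar INPUT_PATH OUT_PATH");
            return 2;
        }
        final String inputPath = args[0];
        final String outPath = args[1];

        // Build the job from getConf() so that options parsed by ToolRunner take effect.
        Job job = new Job(getConf(), WordCountApp.class.getSimpleName());
        job.setJarByClass(WordCountApp.class);

        FileInputFormat.setInputPaths(job, inputPath);
        job.setInputFormatClass(TextInputFormat.class);
        job.setMapperClass(MyMapper.class);

        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileOutputFormat.setOutputPath(job, new Path(outPath));

        // Report failure to the caller instead of always returning 0.
        return job.waitForCompletion(true) ? 0 : 1;
    }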

    public static class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        // Reused across calls to avoid allocating new objects for every word.
        private final Text word = new Text();
        private final LongWritable one = new LongWritable(1L);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws java.io.IOException, InterruptedException {
            // Words in each input line are expected to be tab-separated.
            for (String token : value.toString().split("\t")) {
                word.set(token);
                context.write(word, one);
            }
        }
    }

    public static class MyReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key2, Iterable<LongWritable> value2s, Context context)
                throws java.io.IOException, InterruptedException {
            // Primitive long avoids boxing on every addition.
            long sum = 0L;
            for (LongWritable value2 : value2s) {
                sum += value2.get();
            }
            context.write(key2, new LongWritable(sum));
        }
    }

}
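
To check what the job should produce for a small input before submitting it to a cluster, the map and reduce logic can be mirrored in plain Java. The following is only a rough sketch, not part of the job itself: the class name LocalWordCount and the method countWords are hypothetical names introduced here, and the sample lines are made up for illustration. It assumes tab-separated words, as in MyMapper.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the original post: a plain-Java stand-in
// for the MyMapper + MyReducer pair, useful for checking expected counts.
class LocalWordCount {

    // Mirrors MyMapper (split each line on tabs, emit (word, 1)) and
    // MyReducer (sum the 1s per word). A TreeMap keeps the keys sorted,
    // similar to the sorted key order a single reducer would see.
    static Map<String, Long> countWords(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split("\t")) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input: two lines of tab-separated words.
        List<String> lines = Arrays.asList("hello\tyou", "hello\tme");
        // Print in the key<TAB>value layout that TextOutputFormat uses by default.
        for (Map.Entry<String, Long> e : countWords(lines).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Comparing this output with the part-r-* files written under OUT_PATH for the same input is a quick way to confirm that the job was wired up as intended.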

2014-09-01 10:31