MapReduce Example: Average Score

bigSeven


Category: Distributed systems

Tags: Mapreduce, hadoop, demo, average score

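MapReduce example: average score

The job reads "name score" records from several input files and writes the integer average score for each name.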

Input data:

Contents of a.txt:

aaa 120

bbb 100

ccc 130

ddd 150

Contents of b.txt:

aaa 121

bbb 101

ccc 131

ddd 150

Contents of c.txt:

aaa 119

bbb 99

ccc 129

ddd 150

Output:

aaa 120

bbb 100

ccc 130

ddd 150
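
As a quick check of the output above, the per-name averages can be reproduced without a cluster. The sketch below is plain Java, not Hadoop code: the class name LocalAvgScore and the method averages are hypothetical names introduced only for illustration. It groups the "name score" records by name and applies integer division, which is an assumption about how the averages are rounded.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local stand-in for the job; no Hadoop classes are involved.
public class LocalAvgScore {

    // Groups "<name> <score>" records by name and returns the integer average per name.
    public static Map<String, Integer> averages(List<String> records) {
        Map<String, int[]> acc = new TreeMap<>(); // name -> {sum, count}
        for (String record : records) {
            StringTokenizer sk = new StringTokenizer(record);
            if (sk.countTokens() < 2) {
                continue; // skip blank or malformed records
            }
            String name = sk.nextToken();
            int score = Integer.parseInt(sk.nextToken());
            int[] sumCount = acc.computeIfAbsent(name, k -> new int[2]);
            sumCount[0] += score;
            sumCount[1]++;
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, int[]> e : acc.entrySet()) {
            result.put(e.getKey(), e.getValue()[0] / e.getValue()[1]); // integer division
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
            "aaa 120", "bbb 100", "ccc 130", "ddd 150",  // a.txt
            "aaa 121", "bbb 101", "ccc 131", "ddd 150",  // b.txt
            "aaa 119", "bbb 99", "ccc 129", "ddd 150");  // c.txt
        System.out.println(averages(records));
        // prints {aaa=120, bbb=100, ccc=130, ddd=150}
    }
}
```

For example, aaa has the scores 120, 121 and 119, which sum to 360, and 360 / 3 = 120.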

===========================java code==========================

package gq;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/**
 * Class Description: test class that computes the average score per name
 * Author: gaoqi
 * Date: June 5, 2015, 2:03:08 PM
 */
public class AvgScore {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {

        // With TextInputFormat, key is the byte offset of the line and value is the line text.
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            // Split on the newline character ("\n", not "/n").
            StringTokenizer stk = new StringTokenizer(line, "\n");
            while (stk.hasMoreTokens()) {
                // Each record has the form "<name> <score>".
                StringTokenizer sk = new StringTokenizer(stk.nextToken());
                if (sk.countTokens() < 2) {
                    continue; // skip blank or malformed records
                }
                String name = sk.nextToken();
                int score = Integer.parseInt(sk.nextToken());
                context.write(new Text(name), new IntWritable(score));
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            int cnt = 0;
            for (IntWritable v : values) {
                sum += v.get();
                cnt++;
            }
            // Integer division: any fractional part of the average is discarded.
            context.write(key, new IntWritable(sum / cnt));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Note: this constructor is deprecated in Hadoop 2.x in favor of Job.getInstance(conf, name).
        Job job = new Job(conf, "AvgScore");
        job.setJarByClass(AvgScore.class);
        job.setMapperClass(Map.class);
        // No combiner is set: Reduce emits averages, and an average of
        // partial averages is not, in general, the overall average.
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path("hdfs://h0:9000/user/tallqi/in/inputAvgScore"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://h0:9000/user/tallqi/in/outputAvgScore"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
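
One point worth spelling out is how averaging interacts with a combiner. Hadoop may run a combiner zero or more times on partial groups of values, so a reducer that emits averages is only safe as a combiner when every partial group happens to hold a single value. A common workaround is to carry (sum, count) pairs through the combine step and divide only at the very end. The plain-Java sketch below illustrates just the arithmetic: SumCount and its methods are hypothetical names, not Hadoop classes, and in a real job the pair would have to be a custom Writable.

```java
// Hypothetical helper illustrating combiner-safe averaging; plain Java, no Hadoop types.
public class SumCount {
    public final long sum;
    public final long count;

    public SumCount(long sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    // Builds a partial aggregate from the scores seen in one input split.
    public static SumCount of(int... scores) {
        long s = 0;
        for (int score : scores) {
            s += score;
        }
        return new SumCount(s, scores.length);
    }

    // Merging is associative and commutative, so it can be applied any number of times.
    public SumCount merge(SumCount other) {
        return new SumCount(sum + other.sum, count + other.count);
    }

    // Integer average; assumes count > 0.
    public long average() {
        return sum / count;
    }

    public static void main(String[] args) {
        SumCount splitA = SumCount.of(100, 100); // two scores for one name in one split
        SumCount splitB = SumCount.of(130);      // one score for the same name in another split

        long averageOfAverages = (splitA.average() + splitB.average()) / 2; // (100 + 130) / 2
        long mergedAverage = splitA.merge(splitB).average();                // 330 / 3

        System.out.println(averageOfAverages + " vs " + mergedAverage);
        // prints 115 vs 110
    }
}
```

With the sample files in this post each name appears once per file, so the two approaches would agree there; they diverge as soon as one split holds several scores for the same name.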


Posted: 2015-08-15 16:49