Hadoop Custom Counters

Genie13

Blog category: Hadoop

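import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class testCounter {

	public static class mapper extends Mapper<Text, BytesWritable, Text, Text> {

		// Custom (dynamic) counter, identified by a group name and a counter name.
		private Counter c;

		@Override
		protected void setup(Context context) throws IOException,
				InterruptedException {
			c = context.getCounter("FILE", "COUNT");
		}

		@Override
		protected void map(Text key, BytesWritable value, Context context)
				throws IOException, InterruptedException {
			// One increment per input record.
			c.increment(1);
			// getBytes() returns the backing array, which can be longer than the
			// valid data, so copy only the first getLength() bytes.
			Text content = new Text();
			content.set(value.getBytes(), 0, value.getLength());
			context.write(key, content);
		}

	}

	public static class reducer extends Reducer<Text, Text, Text, Text> {

		@Override
		protected void reduce(Text arg0, Iterable<Text> arg1,
				Context context)
				throws IOException, InterruptedException {
			// Drain the values; this demo ignores their content.
			Iterator<Text> itr = arg1.iterator();
			while (itr.hasNext()) {
				itr.next();
			}

			// Emit a placeholder value for every key.
			context.write(arg0, new Text("heihei"));
		}

	}

	public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {

		Configuration conf = new Configuration();
		// Later Hadoop releases deprecate this constructor in favor of Job.getInstance(conf).
		Job job = new Job(conf);
		job.setJarByClass(testCounter.class);

		FileInputFormat.addInputPath(job, new Path(args[0]));
		FileOutputFormat.setOutputPath(job, new Path(args[1]));

		job.setMapperClass(mapper.class);
		job.setReducerClass(reducer.class);

		// WholeFileInputFormat is a custom InputFormat, not part of Hadoop and not
		// shown here; it is assumed to be in the same package.
		job.setInputFormatClass(WholeFileInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);

		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(Text.class);

		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(Text.class);

		// Counters can only be read once the job has finished.
		job.waitForCompletion(true);

		// Read the value (not the name) of the custom counter set in the mapper.
		// Built-in counters are registered under internal group and counter names
		// that differ between Hadoop releases; display names such as
		// "Map-Reduce Framework" / "Map input records" may not resolve to them.
		long num = job.getCounters().findCounter("FILE", "COUNT").getValue();
		System.out.println(num);
	}

}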

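Counters can only be read after the job has finished, that is, after job.waitForCompletion(true) returns. Calling job.getCounters() before that point fails with an illegal-state error (presumably an IllegalStateException, since the job has not been submitted yet). Inside the reduce function, by contrast, reading a counter raises no error but does not return the job-wide value either. Perhaps the designers simply did not catch the exception inside the reducer; a more likely explanation is that a task only sees the counters of its own attempt, and the framework aggregates them across tasks separately.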

Counters come in two kinds: custom (user-defined) counters and built-in counters.

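Custom counters can be static or dynamic. Static counters are declared with an enum, which is somewhat safer; dynamic counters are named directly with strings. To get friendlier display names, a properties file is sometimes created alongside them.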
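To make the "enum is safer" point concrete, here is a minimal sketch in plain Java. It is a toy model, not the Hadoop Counters class: CounterTable and FileCounters are hypothetical names invented for illustration. It only mimics the naming scheme, where an enum counter is filed under the enum's class name and the constant's name, while a dynamic counter uses whatever strings it is given.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy stand-in for a job's counter table; NOT the Hadoop Counters class.
final class CounterTable {

    // Hypothetical enum playing the role of a "static" counter definition.
    enum FileCounters { FILES_SEEN, EMPTY_FILES }

    private final Map<String, Long> values = new TreeMap<>();

    // "Dynamic" style: group and counter name are free-form strings.
    void increment(String group, String name, long delta) {
        values.merge(group + "/" + name, delta, Long::sum);
    }

    // "Static" style: group is the enum's class name, counter is the constant name.
    void increment(Enum<?> key, long delta) {
        increment(key.getDeclaringClass().getName(), key.name(), delta);
    }

    long get(String group, String name) {
        return values.getOrDefault(group + "/" + name, 0L);
    }

    long get(Enum<?> key) {
        return get(key.getDeclaringClass().getName(), key.name());
    }

    // Number of distinct counters created so far.
    int size() {
        return values.size();
    }

    public static void main(String[] args) {
        CounterTable t = new CounterTable();
        t.increment("FILE", "COUNT", 1);
        // A typo in a string name compiles fine and silently creates a second counter.
        t.increment("FILE", "COUNTS", 1);
        // A typo in an enum constant would be a compile error instead.
        t.increment(FileCounters.FILES_SEEN, 1);
        t.increment(FileCounters.FILES_SEEN, 1);
        System.out.println(t.get("FILE", "COUNT"));          // prints 1
        System.out.println(t.get(FileCounters.FILES_SEEN));  // prints 2
    }
}
```

In a real job the corresponding calls are context.getCounter(someEnumConstant) for the static style and context.getCounter("FILE", "COUNT") for the dynamic style, as in the mapper above.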


2012-04-22 09:04