
Hadoop: where to find the bundled word count example and how to run it

 

 

0 Prepare a file named test with the following content; the words on each line are separated by a tab (\t)

[root@hadoop3 ~]# cat test 
hello   you
hello   me
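
The listing above renders the separator as spaces, so it is easy to end up with the wrong character when typing the file by hand. A minimal sketch for generating it with a real tab, assuming a POSIX shell and that the file should be created in the current directory:

```shell
# printf expands \t to a tab character and \n to a newline.
printf 'hello\tyou\nhello\tme\n' > test

# Dump the file byte by byte; tabs are displayed as \t.
od -c test
```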

 

 

1 Locate the following path

Under hadoop2.5.2/share/hadoop/mapreduce you will find the examples jar, hadoop-mapreduce-examples-2.5.2.jar.

 

2 Run the following command:

[root@hadoop3 mapreduce]# hadoop jar hadoop-mapreduce-examples-2.5.2.jar   wordcount /input/test /output
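
Note that /input/test and /output in this command are HDFS paths, while the test file from step 0 lives on the local disk. The following is a hedged sketch of the whole sequence, not a definitive procedure. It assumes the file sits in the home directory as in step 0, that /output does not exist yet (the job refuses to write into an existing output directory), and that the script is run from the mapreduce directory found in step 1. The variable names and the check for hadoop on the PATH are my own additions.

```shell
LOCAL_FILE="$HOME/test"                           # file from step 0 (assumed location)
INPUT_PATH=/input/test                            # HDFS input path used in step 2
OUTPUT_DIR=/output                                # HDFS output directory used in step 2
EXAMPLES_JAR=hadoop-mapreduce-examples-2.5.2.jar  # jar located in step 1

if command -v hadoop >/dev/null 2>&1; then
    # Upload the local file, run the job, then print the reducer output.
    hadoop fs -mkdir -p /input &&
    hadoop fs -put "$LOCAL_FILE" "$INPUT_PATH" &&
    hadoop jar "$EXAMPLES_JAR" wordcount "$INPUT_PATH" "$OUTPUT_DIR" &&
    hadoop fs -cat "$OUTPUT_DIR/part-r-*"
else
    echo "hadoop not found on PATH; nothing was run" >&2
fi
```

To rerun the job, remove the old output first with hadoop fs -rm -r /output.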

 

 

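If you do not know the names of the programs that can be run, invoke the jar without a program name:

 

hadoop jar hadoop-mapreduce-examples-2.5.2.jar, then press Enter

Hadoop then prints the list of valid program names, e.g.: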

 

[root@hadoop3 mapreduce]# hadoop jar hadoop-mapreduce-examples-2.5.2.jar  
An example program must be given as the first argument.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.

 

 

The wordcount job from step 2 produces the following result:

 

hello	2
me	1
you	1
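
As a quick cross-check that does not need a cluster, the same counts can be reproduced locally with standard Unix tools. This is only a sketch for verifying the expected output on small inputs, not a description of what Hadoop does internally. It assumes words are separated by whitespace, and the function name local_wordcount is hypothetical.

```shell
# Count words read from standard input and print "word<TAB>count", sorted by word.
local_wordcount() {
    tr -s '[:space:]' '\n' |          # put every word on its own line
        sort |                        # group identical words together
        uniq -c |                     # prefix each distinct word with its count
        awk '{ print $2 "\t" $1 }'    # reorder to word, tab, count
}

printf 'hello\tyou\nhello\tme\n' | local_wordcount
```

For the two-line sample file this prints the same three lines as the listing above.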

 
