Writing a Hive UDF

thomas0988


Getting started with writing UDFs
Converting uppercase to lowercase
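package com.afan;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple Hive UDF that converts a string to lowercase.
public class UDFLower extends UDF {
    // Hive locates evaluate() by reflection and calls it once per row.
    public Text evaluate(final Text s) {
        // A NULL column value yields a NULL result.
        if (s == null) {
            return null;
        }
        return new Text(s.toString().toLowerCase());
    }
}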
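
To try the conversion logic without a Hive installation, the body of evaluate can be mirrored in plain Java. This is only a sketch: the class name LowerSketch and the method lower are made up for illustration, it works on String instead of Text, and it passes Locale.ROOT to toLowerCase, which the original does not do. That substitution keeps the result independent of the JVM's default locale (under a Turkish locale, for example, "I" would otherwise lowercase to a dotless "ı").

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-alone mirror of UDFLower.evaluate; not part of the original post.
final class LowerSketch {
    // Same contract as the UDF: null in, null out; otherwise the lower-cased string.
    static String lower(String s) {
        if (s == null) {
            return null;
        }
        // Locale.ROOT is a substitution made for this sketch;
        // the original calls toLowerCase() with the default locale.
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Sample words taken from the test table used later in the post.
        List<String> rows = Arrays.asList("WHO", "AM", "I", "HELLO", "worLd");
        for (String row : rows) {
            System.out.println(lower(row));
        }
    }
}
```

Whether the real UDF should also pin the locale depends on the data; for plain ASCII input under a typical English locale the two variants behave the same.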

1 Load the UDF jar
afan@ubuntu:/usr/local/hadoop/hive$ bin/hive
Hive history file=/tmp/afan/hive_job_log_afan_201105150623_175667077.txt
hive> add jar udf_hive.jar;
Added udf_hive.jar to class path
Added resource: udf_hive.jar
2 Create the UDF function
hive> create temporary function my_lower as 'com.afan.UDFLower';
OK
Time taken: 0.253 seconds
3 Create test data
hive> create table dual (info string);
OK
Time taken: 0.178 seconds
hive> load data local inpath 'data.txt' into table dual;
Copying data from file:/usr/local/hadoop/hive/data.txt
Copying file: file:/usr/local/hadoop/hive/data.txt
Loading data to table default.dual
OK
Time taken: 0.377 seconds
hive> select info from dual;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201105150525_0003, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201105150525_0003
Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201105150525_0003
2011-05-15 06:46:05,459 Stage-1 map = 0%, reduce = 0%
2011-05-15 06:46:10,905 Stage-1 map = 100%, reduce = 0%
2011-05-15 06:46:13,963 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201105150525_0003
OK
WHO
AM
I
HELLO
worLd

Time taken: 14.874 seconds
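
The post does not show the contents of data.txt. Judging from the query output above, it holds one word per line, so a file like it could be generated with the sketch below. The class name MakeTestData and its methods are hypothetical, the five rows are inferred from the output rather than taken from the original file, and it is an assumption that the table uses Hive's default text format with newline-terminated rows.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not from the original post: writes sample rows to a text file.
final class MakeTestData {
    // Rows inferred from the SELECT output shown in the post.
    static final List<String> ROWS = Arrays.asList("WHO", "AM", "I", "HELLO", "worLd");

    // Renders one row per line, each terminated by '\n'.
    static String render() {
        StringBuilder sb = new StringBuilder();
        for (String row : ROWS) {
            sb.append(row).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Writes to the path given as the first argument, or to data.txt by default.
        Path target = Paths.get(args.length > 0 ? args[0] : "data.txt");
        Files.write(target, render().getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote " + target.toAbsolutePath());
    }
}
```

Running it in the directory where the hive CLI is started would produce a data.txt that the load data local inpath statement can pick up.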
4 Use the UDF function
hive> select my_lower(info) from dual;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201105150525_0002, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201105150525_0002
Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201105150525_0002
2011-05-15 06:43:26,100 Stage-1 map = 0%, reduce = 0%
2011-05-15 06:43:34,364 Stage-1 map = 100%, reduce = 0%
2011-05-15 06:43:37,484 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201105150525_0002
OK
who
am
i
hello
world

Time taken: 20.834 seconds

http://blog.sina.com.cn/s/blog_61c463090100rh4j.html


2012-05-04 19:13