`
lancefox
  • 浏览: 63309 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

HIVE 笔记

 
阅读更多
===================================================================
异常片段:java.lang.NoSuchFieldError: ALLOW_UNQUOTED_CONTROL_CHARS
原因:jackson 这个包的版本与 hadoop 的 jackson 版本不一致,替换即可解决
===================================================================
异常片段:No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string). Possible choices: _FUNC_(int)  _FUNC_(bigint, string)  _FUNC_(bigint)  _FUNC_(int, string)
原因:执行SQL时,使用了from_unixtime函数,但参数用了非数值姓,就会产生此异常
解决办法:修改传给from_unixtime函数做参数的字段数据类型,脚本样例如下
alter table invites change bar bar int;
===================================================================




------------------------------------------------------------
创建表,分区列为 ds,以制表符 '\t' 作为列分割符
hive>
CREATE TABLE invites ( foo INT, bar STRING ) PARTITIONED BY (ds STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

创建表,将 select 的结果直接插入新表
hive>
CREATE TABLE ds_cnt1 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' AS select ds,count(ds) as cnt from invites group by ds;

创建外部表,数据存储在HDFS,但不在HIVE的托管目录下
hive>
CREATE EXTERNAL TABLE tableName (columnName type) LOCATION '/hdfspath';

//import data to external table.
//INSERT INTO TABLE tableName partition (b='1', c='1') SELECT * from otherTableName ...

------------------------------------------------------------

Loading data from flat files into Hive:
The keyword 'OVERWRITE' signifies that existing data in the table is deleted.
If the 'overwrite' keyword is omitted, data files are appended to existing data sets
从一个本地文件导入数据到 HIVE

hive>
LOAD DATA LOCAL INPATH '/home/hadoop/invites_2.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2013-04-08');


从HDFS导入数据到 HIVE (去掉 LOCAL 关键字即可)

hive>
LOAD DATA INPATH '/home/hadoop/invites_2.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2013-04-08');

------------------------------------------------------------
修改表名
hive>
ALTER TABLE invites RENAME TO invites_pk;


修改表:字段修改,foo是 string 型,修改成 myFoo int 型
hive>
ALTER TABLE invites CHANGE foo myFoo INT;

修改表:表结构修改,
假设之前结构为:
id string
age int
foo int
bar int
ds string
假设之后结构为:
id string
age string
foo string
bar string
ds string
hive>
ALTER TABLE invites REPLACE COLUMNS (id string,age string,foo string,bar string);

添加字段:
hive>
ALTER TABLE invites ADD COLUMNS (t1 string,t2 string);

ALTER TABLE invites DROP COLUMNS (t1 string,t2 string);


------------------------------------------------------------

查看HIVEQL执行计划 explain 关键字
hive>
explain select * from tblName ...

------------------------------------------------------------
如果要执行复杂的SQL,那么可以用python编辑脚本并执行,比较方便
#!/usr/bin/env python
#coding:utf-8

import sys
import subprocess

con1 = "2013/02/28"
if len(sys.argv) > 1 :
        con1 = sys.argv[1]
script1 = """
select concat( '#con1# ',main1.ct , ':00:00 --> ' , count(main1.rid) ) from (
        select substr( trim(sub1.maketime),1,2) ct , sub1.row_id rid
        from tbl_lrinputdetail_ext_pk sub1
        where not sub1.row_id like '%row%'
        and sub1.makedate like '%#con1#%'
) main1 group by main1.ct;
"""
script1 = script1.replace('#con1#',con1)
script1 = script1.replace('\t',' ')
script1 = script1.replace('\r\n',' ')
print script1
subprocess.call( 'echo "%s" | hive > out.log' % script1 , shell=True )
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics