`
jsh0401
  • 浏览: 10955 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

hive的数据操作

    博客分类:
  • hive
 
阅读更多
1.向表中装载数据
load data local inpath '${env:HOME}/california-employees'
overwrite into table employees
partition (country='US',state='CA');

2.通过查询语句向表中插入数据
insert overwrite/into table employees
partition (country='US',state='OR')
select * from staged_employees se
where se.cnty='US' and se.st='OR';

全表扫描一次
form staged_employees se
insert overwrite table employees
partition (country='US',state='OR')
select * where se.cnty='US' and se.st='OR'
insert overwrite table employees
partition (country='US',state='CA')
select * where se.cnty='US' and se.st='CA'
insert overwrite table employees
partition (country='US',state='IL')
select * where se.cnty='US' and se.st='IL';

动态分区插入
insert into table employees
partition (country,state)
select ...,se.cnty,se.st
from staged_employees se;

静态动态分区混合
insert into table employees
partition (country='US',state)
select ...,se.cnty,se.st
from staged_employees se
where se.cnty='US';

hive.exec.dynamic.partition
默认false,设置成true,表示开启动态分区功能

hive.exec.dynamic.partition.mode
默认strict,设置成nostrict,表示允许所有分区都是动态的

hive.exec.max.dynamic.partitions.pernode
默认100,每个mapper或reducer可以创建的最大动态分区个数

hive.exec.max.dynamic.partitions
默认+1000,一个动态分区创建语句可以创建的最大动态分区个数

hive.exec.max.created.files
默认100000,全局可以创建的最大文件个数


3.单个查询语句创建表并加载数据
create table ca_employees
as select name,salary,address
from employees se
where se.state='CA';

4.导出数据
haddop fs -cp source_path target_path

insert overwrite local directory '/tmp/ca_employees'
select name,salary,address
from employees
where se.state='CA'

指定多个输出文件夹目录
from staged_employees se
insert overwrite directory '/tmp/or_employees'
select * where se.cty='US' and se.st='OR'
insert overwrite directory '/tmp/ca_employees'
select * where se.cty='US' and se.st='CA'
insert overwrite directory '/tmp/il_employees'
select * where se.cty='US' and se.st='IL';


分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics