select count(1) from s_ods_trade where part ='2012-10-31';
22076
select count(1) from s_ods_trade
104343
select count(1) from s_ods_trade_full where part ='2012-10-31';
11456
select count(1) from s_ods_trade_full
53049
SELECT count(1) FROM s_ods_trade a left outer JOIN s_ods_trade_full b ON (a.dp_id = b.dp_id AND a.tid = b.tid and a.part='2012-10-31' and b.part='2012-10-31');
104343
SELECT count(1) FROM s_ods_trade a left outer JOIN s_ods_trade_full b ON (a.dp_id = b.dp_id AND a.tid = b.tid and a.part=b.part and a.part='2012-10-31');
104343
SELECT count(1) FROM s_ods_trade a left outer JOIN s_ods_trade_full b ON (a.dp_id = b.dp_id AND a.tid = b.tid ) where a.part='2012-10-31' and b.part='2012-10-31';
11456
SELECT count(1) FROM s_ods_trade a left outer JOIN s_ods_trade_full b ON (a.dp_id = b.dp_id AND a.tid = b.tid and a.part=b.part) where a.part='2012-10-31';
22076
倒数第二个第三个sql很是不解。
最后一个sql效果相同语句,效率会高一些:
SELECT count(1) FROM (select * from s_ods_trade where part ='2012-10-31') a left outer JOIN (select * from s_ods_trade_full where part ='2012-10-31' ) b ON (a.dp_id = b.dp_id AND a.tid = b.tid);
官网解释https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins#:
Joins occur BEFORE WHERE CLAUSES. So, if you want to restrict the OUTPUT of a join, a requirement should be in the WHERE clause, otherwise it should be in the JOIN clause. A big point of confusion for this issue is partitioned tables:
SELECT a.val, b.val FROM a LEFT OUTER JOIN b ON (a.key=b.key)
WHERE a.ds='2009-07-07' AND b.ds='2009-07-07'
will join a on b, producing a list of a.val and b.val. The WHERE clause, however, can also reference other columns of a and b that are in the output of the join, and then filter them out. However, whenever a row from the JOIN has found a key for a and no key for b, all of the columns of b will be NULL, including the ds column. This is to say, you will filter out all rows of join output for which there was no valid b.key, and thus you have outsmarted your LEFT OUTER requirement. In other words, the LEFT OUTER part of the join is irrelevant if you reference any column of b in the WHERE clause. Instead, when OUTER JOINing, use this syntax:
SELECT a.val, b.val FROM a LEFT OUTER JOIN b
ON (a.key=b.key AND b.ds='2009-07-07' AND a.ds='2009-07-07')
..the result is that the output of the join is pre-filtered, and you won't get post-filtering trouble for rows that have a valid a.key but no matching b.key. The same logic applies to RIGHT and FULL joins.
- 大小: 45 KB
分享到:
相关推荐
hive sql + left join 数据缺失
SQL left join用法,初学者应用
Hive函数及语法说明;详细中文语法说明;兼容性说明
facebook hive中的各种join策略的slides,reporter是个Chinese。
里面 一个例子,说了几个需求,基本能符合,我要使用的功能,左外连接,还有exists替代方案,都很实用,结合文档看一下。
hive练习数据和hive练习题包含了hive练习数据,hive数据的建表ddl和hive练习题,非常...LEFT,RIGHT 和 FULL OUTER JOIN LEFT SEMI JOIN Hive当前没有实现 IN/EXISTS 子查询,可以用 LEFT SEMI JOIN 重写子查询语句。
NULL 博文链接:https://weigang-gao.iteye.com/blog/2260663
hive hive hive hive hive hive hive hive hive hive hive hive
Hive: Join Strategies. Facebook summit-2011
4. Hive Join 29 5. HIVE参数设置 31 6. HIVE UDF 33 6.1 基本函数 33 6.1.1 关系操作符 33 6.1.2 代数操作符 34 6.1.3 逻辑操作符 35 6.1.4 复杂类型操作符 35 6.1.5 内建函数 36 6.1.6 数学函数 36 6.1.7 集合函数...
Hive on Spark EXPLAIN statement : 讲述了 Common Join / Map join / Bucket Map Join / Sorted Merge Bucket Map Join / skew join 在explain 中的 树结构 。In Hive, command EXPLAIN can be used to show the ...
4. Hive Join 29 5. HIVE参数设置 31 6. HIVE UDF 33 6.1 基本函数 33 6.1.1 关系操作符 33 6.1.2 代数操作符 34 6.1.3 逻辑操作符 35 6.1.4 复杂类型操作符 35 6.1.5 内建函数 36 6.1.6 数学函数 36 6.1.7 集合函数...
使用hive3.1.2和spark3.0.0配置hive on spark的时候,发现官方下载的hive3.1.2和spark3.0.0不兼容,hive3.1.2对应的版本是spark2.3.0,而spark3.0.0对应的hadoop版本是hadoop2.6或hadoop2.7。 所以,如果想要使用高...
hive-jdbc
hive 下dual表,Lock,explain, 数据类型,开发常见的问题
1 Hive 概念与连接使用: 2 2 Hive支持的数据类型: 2 2.1原子数据类型: 2 2.2复杂数据类型: 2 2.3 Hive类型转换: 3 3 Hive创建/删除数据库 3 3.1创建数据库: 3 3.2 删除数据库: 3 4 Hive 表相关语句 3 4.1 Hive ...
01.hive查询语法--基本查询--条件查询--关联查询.mp4
Hive表生成工具,Hive表生成工具Hive表生成工具
利用Hive进行复杂用户行为大数据分析及优化案例(全套视频+课件...14_Hive中的数据倾斜及解决方案-三种join方式 15_Hive中的数据倾斜及解决方案-group by 16_Hive中使用正则加载数据 17_Hive中使用Python脚本进行预处理
《Hive数据仓库案例教程》教学课件 第5章 Hive数据操作.pdf《Hive数据仓库案例教程》教学课件 第5章 Hive数据操作.pdf《Hive数据仓库案例教程》教学课件 第5章 Hive数据操作.pdf《Hive数据仓库案例教程》教学课件 第...