guoyunsky

The Biggest Differences Between HiveQL (Hive SQL) and Ordinary SQL

      Weibo: http://weibo.com/guoyunwb

 

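      I have been using Pig all along, and now I also need to work with Hive. I found some material online that seemed useful, so I am translating it here. The translation is probably not very accurate; I will write my own summary once I am more familiar with Hive.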

 

 * No true date/time data types, no interval types, and many missing UDFs for manipulating dates (e.g. ADD_MONTH)

* Strict type matching without support for automatic coercion or typed literals (e.g. CASE <bigint expr> WHEN 1 THEN ... END)

* All queries must reference a table (no 'dual' or table-less queries)

* No session-scoped temp tables

* No 'IN' predicate

* No 'FIND' string search function for producing the offset to a match

* No find/replace string functions for plain strings (i.e. not regex)

* XPATH UDFs cannot return a string representing an entire subtree in the DOM, which prevents composition.

* Few mechanisms for collapsing arrays to scalar types (e.g. 'join' complement of string 'split'; aggregations other than 'size' for numeric arrays; etc.)

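Rough translation:

1. HiveQL has no true date/time types and no interval types, and it lacks many functions for manipulating dates and times (such as ADD_MONTH).

2. HiveQL enforces very strict type matching and does not support automatic type conversion (for example, CASE big_int_number WHEN 1 THEN ... END is not supported). My understanding is that a bigint value is not automatically converted to int for you.

3. Every HiveQL query must reference a table. Ordinary SQL often allows a query with no table at all, or one against a dummy table such as 'dual'; HiveQL does not.

4. HiveQL has no session-scoped temporary tables.

5. HiveQL has no IN predicate.

6. HiveQL has no FIND function that returns the offset of a match, and no find/replace functions for plain (non-regex) strings.

7. The XPATH UDFs in HiveQL cannot return a string representing an entire DOM subtree, which prevents composition.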

8.Few mechanisms for collapsing arrays to scalar types (e.g. 'join' complement of string 'split'; aggregations other than 'size' for numeric arrays; etc.)
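The first item in the list above mentions ADD_MONTH as a missing date function. As a rough illustration of what such a helper computes, here is a minimal Python sketch. The function names `add_months` and `add_months_str` are hypothetical, and clamping to the last day of the target month is an assumption made here; other databases may handle month ends differently.

```python
import calendar
from datetime import date


def add_months(day: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    # Work with a zero-based month index so that divmod handles year rollover.
    month_index = day.year * 12 + (day.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    # Clamp the day, e.g. Jan 31 + 1 month -> the last day of February.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def add_months_str(value: str, months: int) -> str:
    """Same operation on 'YYYY-MM-DD' strings (assumed storage format for dates)."""
    return add_months(date.fromisoformat(value), months).isoformat()
```

Logic like this could live in an external script or a custom UDF; how it is wired into a query depends on the Hive version in use.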

===========================================================================================================================================================

 

1.No windowing functions, e.g. SUM(sales) OVER (PARTITION BY date).  It's difficult to do a lot of things common to warehousing, like a running sum, without having to write custom mappers/reducers or a UDF.

2.No regular UNION, INTERSECT, or MINUS operators.

3.Null values are treated differently from empty strings, and are exported differently.  For example, empty strings are exported as '\n' and nulls are exported as nulls.  I know this isn't unique to Hive, but it is still annoying when exporting data from Hive into another system.

4.No hierarchical/self-referencing querying.  I know most distributed computing solutions can't do this, but it can be very handy.

5.No Update or Delete statements.

6.I haven't been able to find any kind of cost-based explain plan.  Running an explain plan generally just shows the path used to access the data.  That is useful to some degree, but it would be great if it were more advanced and helped the user understand which steps cause the biggest slowdowns.
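The first item in this list says a running sum needs a custom mapper/reducer. A minimal sketch of such a reducer in Python is shown below, for use as a streaming script. It assumes each input line is `key<TAB>value`, and that the query has already grouped rows with the same key together and sorted them (in Hive, typically with DISTRIBUTE BY and SORT BY). The names `running_sums` and `main` are hypothetical.

```python
import sys
from typing import Iterable, Iterator


def running_sums(lines: Iterable[str]) -> Iterator[str]:
    """Yield 'key<TAB>value<TAB>running_total' for each 'key<TAB>value' input line.

    Assumes rows sharing a key arrive contiguously and in the desired order.
    """
    current_key = None
    total = 0.0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, value = line.split("\t")
        if key != current_key:
            # A new partition starts: reset the accumulator.
            current_key, total = key, 0.0
        total += float(value)
        yield f"{key}\t{value}\t{total}"


def main() -> None:
    """Read rows from stdin and write the augmented rows to stdout."""
    for row in running_sums(sys.stdin):
        print(row)
```

To use this as a script, call `main()` at the bottom of the file. The accumulator is reset only when the key changes, so the result is correct only if the input really is grouped by key; unsorted input would silently produce wrong totals.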

=======================================================================================================================================================================

 

1. For the row format's line terminator (LINES TERMINATED BY), only '\n' is supported.

2. Hive does not support running a query that selects from tables in more than one database.

3. Hive does not support sub-queries such as those connected by IN/EXISTS in the WHERE clause.

4. Hive does not support the truncation of data from a table.

===========================================================================================================================================================

 
