关于Oracle执行计划

mikixiyou

浏览: 1107626 次
性别:
来自: 南京

最近访客更多访客>>

ringphone

seahb

my_xinwen

stevenyj

博主相关

博客

微博

相册

留言

关于我

博客专栏

: Oracle管理和开发
浏览量：355752

文章分类

社区版块

存档分类

博客分类：

Oracle

oracle 执行计划提示 use_nl hash join

所谓执行计划

执行计划是指Oracle 运行的一条SQL 语句按照某一顺序操作的动作。

使用explain plan for 语句可以查看执行计划。在plsql developer 工具中，可以直接使用explain plan window 查看SQL 语句的执行计划。

SQL 语句为：

select  a.doc_id, a.content, b.title

  from cms_doc_body a, cms_doc_single_attr b

 where a.doc_id = b.doc_id

   and b.title like 'abc%'

   and a.content_type = 'text/plain'

得到的执行计划为：

SELECT STATEMENT, GOAL = ALL_ROWS 

NESTED LOOPS

  TABLE ACCESS BY INDEX ROWID 

   INDEX RANGE SCAN

  TABLE ACCESS BY INDEX ROWID

   INDEX UNIQUE SCAN

(miki西游 @mikixiyou 原文链接: http://mikixiyou.iteye.com/blog/1675896 )

如何阅读执行计划

执行计划应该是：由上至下，从右向左的顺序运行SQL 语句。

由上至下：在执行计划中一般含有多个节点，相同级别( 或并列) 的节点，靠上的优先执行，靠下的后执行

从右向左：在某个节点下还存在多个子节点，先从最靠右的子节点开始执行。

关于表访问方式

1.Full Table Scan (FTS) 全表扫描

2.Index Lookup 索引扫描

There are 5 methods of index lookup:

index unique scan -- 索引唯一扫描

Method for looking up a single key value via a unique index. always returns a single value, You must supply AT LEAST the leading column of the index to access data via the index.

index range scan -- 索引局部扫描

Index range scan is a method for accessing a range values of a particular column. AT LEAST the leading column of the index must be supplied to access data via the index. Can be used for range operations (e.g. > < <> >= <= between) .

index full scan -- 索引全局扫描

Full index scans are only available in the CBO as otherwise we are unable to determine whether a full scan would be a good idea or not. We choose an index Full Scan when we have statistics that indicate that it is going to be more efficient than a Full table scan and a sort. For example we may do a Full index scan when we do an unbounded scan of an index and want the data to be ordered in the index order.

index fast full scan -- 索引快速全局扫描，不带order by 情况下常发生

Scans all the block in the index, Rows are not returned in sorted order, Introduced in 7.3 and requires V733_PLANS_ENABLED=TRUE and CBO, may be hinted using INDEX_FFS hint, uses multiblock i/o, can be executed in parallel, can be used to access second column of concatenated indexes. This is because we are selecting all of the index.

index skip scan -- 索引跳跃扫描，where 条件列是非索引的前导列情况下常发生

Index skip scan finds rows even if the column is not the leading column of a concatenated index. It skips the first column(s) during the search.

3.Rowid 物理ID 扫描

This is the quickest access method available.Oracle retrieves the specified block and extracts the rows it is interested in. --Rowid 扫描是最快的访问数据方式

关于表连接方式

1. 嵌套循环(Nested Loops 简称NL )

两表嵌套循环连接的SQL 如下：

select /*+use_nl(a b) leading(b)*/

 a.doc_id, a.content, b.title

  from cms_doc_body a, cms_doc_single_attr b

 where a.doc_id = b.doc_id

   and b.title like 'abc%'

   and a.content_type = 'text/plain'

嵌套循环连接实现机制( 伪代码) 如下：

declare

begin

  for outer_table in (select doc_id, title

                        from cms_doc_single_attr

                       where title like 'abc%') loop

    for inner_table in (select doc_id, content

                          from cms_doc_body

                         where doc_id = outer_table.doc_id

                           and content_type = 'text/plain') loop

   

      dbms_output.put_line(inner_table.doc_id,

                           inner_table.content,

                           outer_table.title);

   

    end loop;

  end loop;

end;

这段代码由两个循环构成。

嵌套循环中的这两个表通常称为外部表（outer table) 和内部表(inner table) 。

在嵌套循环连接中, 在循环的外围，故称为外部表，同时它也是驱动表(driver table) 。

在伪代码中可以看出，outer_table 为外部表，也是驱动表，inner_table 为内部表。

从伪代码中可以看出该连接过程就是一个2 层嵌套循环，所以外层循环的次数越少越好，这也就是我们为什么将小表或返回较小结果集的表作为驱动表的原因。

NEST LOOP JOIN COST = 从第一个表取得数据的成本 + 从第一个表得到结果的基数 * 对第二个表访问一次的成本

所以嵌套循环一般适合于驱动表记录集比较少（<10000 ）且内表有高效索引访问方式。

使用USE_NL(a b) 可强制CBO 执行嵌套循环连接。在加上leading(b), 可以确保b 表为外部表，即驱动表。

驱动表确定：

驱动表一般为根据where 条件能得到较小结果集的表，而不一定是整个表记录比较小的表。

2. 哈希连接(Hash Join )

Hash join 一般用于一张小表和一张大表进行join 时。在绝大多数情况下，hash join 效率比其他join 方式效率更高。

Oracle 会选择一个表作为驱动表，先根据过滤条件排除不必要的数据，然后将结果集做成hash 表，放入进程的hash area ，接着扫描第二张表，将join 键值做hash 运算，到内存的hash 表里面去探测，如果探测成功，就返回数据，否则这行就丢弃掉。

3. 排序合并连接(Sort Merge Join)

通常情况下散列连接的效果都比排序合并连接要好，然而如果行源已经被排过序，在执行排序合并连接时不需要再排序了，这时排序合并连接的性能会优于散列连接。可以使用USE_MERGE(table_1 table_2) 来强制使用排序合并连接。

过程：将两个表排序，然后将排序后两个表合并。

应用场景

1、嵌套循环(nest loop)

对于被连接的数据子集较小的情况，嵌套循环连接是较好的选择。在嵌套循环中，外表驱动内表，外表返回的每一行都要在内表中检索找到它匹配的行，因此整个查询返回的结果集不能太大（大于10000 不合适），要把返回子集较小的表作为外表（驱动表），而且在内表的连接字段上一定要有索引。

2 、哈希连接(hash join)

哈希连接是大数据集连接时常用的方式，优化器使用两个表中较小的表，利用连接键在内存中建立散列表，然后扫描较大的表并探测散列表，找出与散列表匹配的行。

这种方式适用于较小的表完全可以放入内存的情况，这样成本就是访问两个表的成本之和。但是在表很大的情况下并不能完全放入内存，这时优化器将它分割成若干不同的分区，不能放入内存的部分就把该分区写入磁盘的临时段。

哈希连接只能应用于等值连接( 如WHERE A.COL3 = B.COL4) 、非等值连接(WHERE A.COL3 > B.COL4) 、外连接(WHERE A.COL3 = B.COL4(+)) 。

3 、排序合并连接（Sort Merge Join ）

通常情况下哈希连接的效果都比排序合并连接要好。然而如果行源已经被排过序，在执行排序合并连接时不需要再排序了，这时排序归并连接的性能会忧于哈希连接。

关于优化器和提示

Oracle 数据库中优化器（Optimizer ）是SQL 分析和执行的优化工具，它负责指定SQL 的执行计划，也就是它负责保证SQL 执行的效率最高，比如优化器决定Oracle 以什么样的方式来访问数据，是全表扫描（Full Table Scan ），索引范围扫描（Index Range Scan ）还是全索引快速扫描（INDEX Fast Full Scan ：INDEX_FFS ）; 对于表关联查询，它负责确定表之间以一种什么方式来关联，比如HASH_JOHN 还是NESTED LOOPS 或者MERGE JOIN 。这些因素直接决定SQL 的执行效率，所以优化器是SQL 执行的核心，它做出的执行计划好坏，直接决定着SQL 的执行效率。

但有时候，优化器生成的执行计划不一定是最优的，因此需要人工干预它。通过在select 或update 或delete 关键字后面加/*+xxx*/ 来实现，它称为hints ，即提示。

分享到：

Oracle中insert into select和select into ... | Oracle的pipelined函数实现高性能大数据处 ...

2012-09-11 15:52
浏览 2589
评论(0)
分类:数据库
查看更多

发表评论

您还没有登录,请您登录后再发表评论