`

《Pro Oracle SQL》Chapter7 -- 7.8Extending Reports With GROUPING_ID()

阅读更多

    7.8 Extending Reports With GROUPING_ID() 使用GROUPING_ID()扩展报告   (page210)

    The GROUPING_ID()function is relatively new compared to the GROUPING() function, having been
introduced in Oracle 9i, and is somewhat similar to the GROUPING() function.  Whereas GROUPING()
evaluates the expression and returns a 0 or 1, the GROUPING_ID() function evaluates an expression,
determines which, if any, of the columns in its arguments are being used to generate a superaggregate row, creates a bit vector, and returns that value as an integer.  
    GROUPING_ID()函数相比GROUPING()函数要新一些,Oracle9i引入的,一定程度上相似GROUPING()函数。 GROUPING()评估表达式返回0或者1,GROUPING_ID函数评估表达式,如果可能,判断哪些参数列用于生成一超聚合行,创建一位矢量,返回一整数值。
    Perhaps it would be simpler to see how GROUPING_ID() works with an example.  The SQL in  
Listing 7-15 first creates a single row consisting of two columns, BIT_1 and BIT_0, with values of 1 and 0 respectively. The subquery cubed uses GROUP BY CUBE to generate four rows from the single row of input.  The GROUPING_ID() function returns the decimal value of the bit vector that represents the actions of CUBE to the current row. The first two uses of the GROUPING() function then create a 1 or 0 based on the actions of CUBE on the row. These will be used to create a bit vector in the final output.  The next two GROUPING() functions then create values displayed in the final output indicating which column that CUBE is currently working on.  The final output displays the decimal bit vector, as well as a binary representation of the vector.  As would be expected with two binary digits, there are four rows of output. 

    可能通过示例容易理解GROUPING_ID()如何工作的。在列表7-15的SQL中首先创建由两列,BIT1和BIT0,组成的单行,分别对应值1 和0。子查询模块cubed使用GROUP BY CUBE从输入的单行生成4行。GROUPING_ID()函数返回为矢量的十进制值,代表CUBE对当前行的动作。 前两个使用GROUPING()函数基于行上的CUBE动作创建1或者0。这最终用于在最后的输出中创建位图矢量。接下来的两个GROUPING()函数用于创建显示在最终输出中,用于指示出CUBE当前作用的那列。最终输出显示十进制的位图矢量,还有矢量的二进制标示。
Listing 7-15. GROUPING_ID() Bit Vector
  1  with rowgen as (
  2    select 1 bit_1, 0 bit_0
  3    from dual
  4  ),
  5  cubed as (
  6    select
  7      grouping_id(bit_1,bit_0) gid
  8      , to_char(grouping(bit_1)) bv_1
  9      , to_char(grouping(bit_0)) bv_0
 10      , decode(grouping(bit_1),1,'GRP BIT 1') gb_1
 11      , decode(grouping(bit_0),1,'GRP BIT 0') gb_0
 12    from rowgen
 13    group by cube(bit_1,bit_0)
 14  )
 15  select
 16    gid
 17    , bv_1 || bv_0 bit_vector
 18    , gb_1
 19    , gb_0
 20  from cubed
 21  order by gid; 
              BIT              GROUPING             GROUPING
 GID       VECTOR          BIT 1                     BIT 0
----          ------           ---------             ---------
   0          00
   1          01               GRP BIT 0
   2          10               GRP BIT 1
   3          11               GRP BIT 1              GRP BIT 0
 
4 rows selected. 

    So, what good is it?  You already know how to use GROUPING() to control output via the HAVING clause,  why learn another way?  Those are fair question when you consider that the examples in Listing 7-14  can already create the wanted output.
   既然这样,它有什么好处?你已经知道通过HAVING子句如何使用GROUPING()控制输出,为什么还要学习其它方式?当你想到列表7-14中的例子已经能创建想要的输出了,这就是最直接的问题了
    In the interest of database efficiency, a single GROUPING_ID() call can be used to replace all of the
different HAVING GROUPING() clauses from Listing 7-14.   The GROUPING() function is limited in its ability to discriminate rows, as it can return only a 0 or 1.  Since the GROUPING_ID() function returns a decimal value based on a bit vector, it can easily be used to make many different comparisons without any changes to the SQL.
    从数据库效率的观点看,单独的GROUPING_ID调用能用于替换列表7-14中的所有不同HAVING GROUPING()子句。GROUPING()函数辨别行的能力有限,因为它只能返回0或者1.因为GROUPING_ID()函数返回一基于位矢量的十进制值,它能易于作出多种比较而不需要修改SQL。
    Why should you care about changing comparisons without changing the SQL? If you are building an application based on the sales history examples, the user may be given four choices of output, and any one or more of them may be chosen.  The user choices can be used as inputs to a single SQL statement that uses HAVING GROUPING_ID(), rather than multiple SQL statements based on different combinations of HAVING GROUPING(), so it requires less parsing of SQL by the database. It will also result in fewer SQL statements to execute, less IO, and less memory usage.    
    为什么你关注于改变比较而不修改SQL?如果你建立了一个基于销售历史例子的应用,用户可能给出输出的四种选择,且可能选择任何一种或多种。用户的选择可看作是使用HAVING GROUPING_ID()的单个SQL语句的输入,而不是基于HAVING GROUPING()的不同组合的多个SQL语句, 如此以来它就可以更少的依赖数据库解析SQL。它也能导致更少的SQL语句执行,更少的IO,和更少的内存使用。
    Just as using CUBE eliminated multiple SQL statements joined by UNION ALL, GROUPING_ID() can
eliminate multiple SQL statements in your application.  The choices given to the user will be as follows: ALL DATA– Display all income level and age range aggregations
ALL AGE– Aggregate all age ranges together
ALL INCOME– Aggregate all income levels together
SUMMARY – Summary only
     正如使用CUBE能消除多个用UNION ALL连接的SQL语句一样,GROUPING_ID能消除你应用中的多个SQL语句。给用户有如下四种选择:
ALL DATA– 显示所有收入层级和年龄范围的聚合
ALL AGE– 聚合所有年龄范围的
ALL INCOME– 聚合所有收入层级的
SUMMARY – 所有的汇总
    The application, a SQL*Plus script in this case, will assign to variables values corresponding to the
user’s choices. The SQL statement, in turn, will evaluate those variables via HAVING GROUPING_ID() to output the requested rows. Listing 7-16  simulates the choices a user might make and demonstrates how to use these inputs in the SQL.  In the example, the only rows to be output will be those that are aggregates of all income levels regardless of age group (ALL_AGE) and the summary columns for each
product category (ALL_AGE and ALL_INCOME_LEVEL). 
  This is accomplished by setting N_AGE_RANGE and N_SUMMARY to 2 and 4, respectively. These values correspond to the bit vector generated by the
GROUPING_ID() function found in the HAVING clause. 
    这个应用,这次使用的是SQL*Plus脚本,将对应于用户的选择赋予变量值。SQL语句,最终,将通过HAVING GROUPING_ID()评估这些变量输出要求的行。列表7-16模拟一用户可能做出的选择和演示如何在SQL中使用这些输入 例子中,只有那些“所有收入层级而不论年龄组(ALL_AGE)的聚合和每一产品种类(不论ALL_AGE 和ALL_INCOME_LEVEL)的汇总”的行才得以输出。 通过分别设定变量N_AGE_RANGE和N_SUMMERY成2和4得以实现。这些值对应于HAVING子句中的GROUPING()函数产生的位矢量。
    As used in the HAVING clause, 1 is added to the value generated by GROUPING_ID().  This is to enable some consistency in setting the values of the variables that control the output.  Without adding 1 to the value, the N_ALL_DATA variable would be set to 0 to enable output, and some other value, such as -1 to disable it.  Increasing this comparison values by 1 makes it possible to consistently use 0 as a value to disable output. 
     在HAVING子句中的运用中,由GROUPING_ID()生成的值都加了1.这是为了能一致性的设置变量值而控制输出。 不加1的话,N_ALL_DATA变量要置成0才能激活输出,而用其它的值,如-1才能失效输出。把比较值加1就使得一致性的用0值失效输出成为可能。
Listing 7-16. GROUPING_ID() To Control Report Output
SQL> variable N_ALL_DATA number
SQL> variable N_AGE_RANGE number
SQL> variable N_INCOME_LEVEL number
SQL> variable N_SUMMARY number
SQL>
SQL> begin
  2    -- set values to 0 to disable
  3    :N_ALL_DATA     := 0; -- 1 to enable
  4    :N_AGE_RANGE    := 2; -- 2 to enable
  5    :N_INCOME_LEVEL := 0; -- 3 to enable
  6    :N_SUMMARY      := 4; -- 4 to enable
  7  end;
  8  /
 
  1  with tsales as (
  2  select /*+ gather_plan_statistics */
  3      s.quantity_sold
  4      , s.amount_sold
  5      , to_char(mod(cust_year_of_birth,10) * 10 ) || '-' ||
  6        to_char((mod(cust_year_of_birth,10) * 10 ) + 10) age_range
  7      , nvl(c.cust_income_level,'A: Below 30,000') cust_income_level
  8      , p.prod_name
  9      , p.prod_desc
 10      , p.prod_category 
11      , (pf.unit_cost * s.quantity_sold)  total_cost
 12      , s.amount_sold - (pf.unit_cost * s.quantity_sold)  profit
 13    from sh.sales s
 14    join sh.customers c on c.cust_id = s.cust_id
 15    join sh.products p on p.prod_id = s.prod_id
 16    join sh.times t on t.time_id = s.time_id
 17    join sh.costs pf on
 18      pf.channel_id = s.channel_id
 19      and pf.prod_id = s.prod_id
 20      and pf.promo_id = s.promo_id
 21      and pf.time_id = s.time_id
 22    where  (t.fiscal_year = 2001)
 23  )
 24  select
 25    'Q' || to_char(grouping_id(cust_income_level,age_range)+1) query_tag
 26    , prod_category
 27    , decode(grouping(cust_income_level),1,'ALL INCOME',cust_income_level)
cust_income_level
 28    , decode(grouping(age_range),1,'ALL AGE',age_range) age_range
 29    , sum(profit) profit
 30  from tsales
 31  group by prod_category, cube(cust_income_level,age_range)
 32    having grouping_id(cust_income_level,age_range)+1
 33      in(:N_ALL_DATA,:N_AGE_RANGE,:N_INCOME_LEVEL,:N_SUMMARY)
 34  order by prod_category, profit; 

QUERY                                                      AGE
TAG    PRODUCT CATEGORY               INCOME LEVEL         RANGE             PROFIT
------ ------------------------------ -------------------- -------- ---------------
...
Q2     Hardware                       K: 250,000 - 299,999 ALL AGE       $26,678.00
Q2     Hardware                       L: 300,000 and above ALL AGE       $28,974.28
Q2     Hardware                       J: 190,000 - 249,999 ALL AGE       $43,761.47
...
Q2     Hardware                       E: 90,000 - 109,999  ALL AGE      $135,154.59
Q2     Hardware                       F: 110,000 - 129,999 ALL AGE      $199,270.01
Q4     Hardware                       ALL INCOME           ALL AGE      $987,386.78
...
65 rows selected. 
    To be fair, it is possible to achieve the same results using the GROUPING() function, but it requires
several tests to be placed in the HAVING clause.  The queries of sample sales history data include only two columns in the CUBE arguments.  The total number of tests required in the HAVING clause is four, as GROUPING() clause will return either a 1 or a 0, so there are two possible values for each of your columns, resulting in four tests.  That doesn’t seem too bad, but consider what happens when there three columns: the number of tests goes up to nine.  The number of tests required will be 2^n
 where n is the number of columns or expressions in arguments to CUBE.   
    一般而言,是有可能使用GROUPING()函数取得相同的结果的,但是它需要在HAVING子句中测试几次。样本销售历史数据的查询仅包含CUBE参数中的两列。在HAVING子句中所需的总测试数为4次,因为GROUPING()子句要么返回1要么返回0,你的每一列就有两种可能值,导致要4次测试。这看上去似乎还不太糟,但是试想一下有3列的情况:测试次数达到9次。测试数将是2的n次方,n是CUBE参数列表中的列或者表达式。
    Listing 7-17 shows the HAVING clause as it might appear using GROUPING() rather than GROUPING_ID().  This approach would soon become unwieldy if there were many arguments required for the CUBE extension.  The four separate tests shown should not be too much trouble to maintain. However, if the number of column in the CUBE arguments goes up from two to three, there will then be nine tests. This is not code that lends itself well to maintenance.
    列表7-17展示了使用GROUPING()而不是GROUPING_ID()的HAVING子句可能的场景。如果CUBE扩展式中所需的参数很多,这种方法很快将变得笨重。所示的四个独立测试还不是太难维护。然而,如果CUBE参数的列数量由2涨到3,就要进行9次测试。这不是一段自维护性好的代码。
Listing 7-17.  Using GROUPING() instead of GROUPING_ID()
32 having  -- bin_to_num() requires 9i+
33   ( bin_to_num(grouping(cust_income_level), grouping(age_range))+1 = :N_ALL_DATA)
34   or ( bin_to_num(grouping(cust_income_level), grouping(age_range))+1 = :N_AGE_RANGE)
35   or ( bin_to_num(grouping(cust_income_level), grouping(age_range))+1 = :N_INCOME_LEVEL)
36   or ( bin_to_num(grouping(cust_income_level), grouping(age_range))+1 = :N_SUMMARY)

EXPERIMENT WITH GROUPING() AND GROUPING_ID()

As an exercise, modify the code from Listing 7-16 so that it adds another column to the arguments to
CUBE.  Then modify the call to GROUPING_ID() in the HAVING clause to work with the new column.  This will require a new variable as well in the PL/SQL block.  
After you have that working, replace the GROUPING_ID() call with all the tests needed to accomplish the same output control with GROUPING().  Do you like the results?  Is this code that you would like to maintain? 

对 GROUPING() 和 GROUPING_ID() 进行实验
作为一个练习,修改列表7-16的代码,对CUBE参数再加上另一列。然后修改HAVING子句中GROUPING_ID()对新列的调用。这将还需要在PL/SQL块中增加一新变量。在你完成上述工作之后,用必要的测试,由GROUPING()控制完成相同输出替换对GROUPING_ID()的调用。你喜欢这结果么?这是你想要维护的代码么?



0
0
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics