`
hot66hot
  • 浏览: 444249 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

How to select the first least max top N rows from each group in SQL

阅读更多
从分组中选择极值查询/前N项查询 是经常会遇到的问题 ,下面通过简单举例展示这种SQL的写法

举例表
type variety price
apple gala 2.79
apple fuji 0.24
apple limbertwig 2.87
orange valencia 3.59
orange navel 9.36
pear bradford 6.05
pear bartlett 2.14
cherry bing 2.55
cherry chelan 6.33


Selecting the one minimum row from each group

期望结果
type variety price
apple fuji 0.24
orange valencia 3.59
pear bartlett 2.14
cherry bing 2.55



方法一 通过分组子查询实现

select f.type, f.variety, f.price
from (
select type, min(price) as minprice
from fruits group by type
) as x inner join fruits as f on f.type = x.type and f.price = x.minprice;


结果:

type variety price
apple fuji 0.24
cherry bing 2.55
orange valencia 3.59
pear bartlett 2.14


方法二 通过关联子查询实现


select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type);


结果:

type variety price
apple fuji 0.24
orange valencia 3.59
pear bartlett 2.14
cherry bing 2.55


以上两个查询是等价的.

Select the top N rows from each group

每组前N个查询是比较痛苦的问题,因为聚合函数只返回一个值,所以通过聚集函数分组查询前几个数据是不可能的.

比方说,我要选择每个类型最便宜的两个水果。
可以通过变换SQL写法实现

方法一:

select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type)
or price = (select min(price) from fruits as f where f.type = fruits.type
and price > (select min(price) from fruits as f2 where f2.type = fruits.type));


结果:
type variety price
apple gala 2.79
apple fuji 0.24
orange valencia 3.59
orange navel 9.36
pear bradford 6.05
pear bartlett 2.14
cherry bing 2.55
cherry chelan 6.33

这种大量子查询方法性能很差,如果是前3个,前4个等等 这种查询变得不可实现


方法二 从每个品种的水果,品种不超过第二便宜的查询,通过关联子查询实现

select type, variety, price
from fruits
where (
 select count(1) from fruits as f
 where f.type = fruits.type and f.price < fruits.price
) <= 2;


第二种方法在fruits表很大时效果不佳

方法三

可以使用union all 实现 (union all 与 union 的区别是 前者不会通过排序消除重复)

(select * from fruits where type = 'apple' order by price limit 2)
union all
(select * from fruits where type = 'orange' order by price limit 2)
union all
(select * from fruits where type = 'pear' order by price limit 2)
union all
(select * from fruits where type = 'cherry' order by price limit 2)


如果分组(这里是水果种类)数量不大/分页的情况下,可以使用union把数据切成多段分开查询,在好的索引支持下效率很高.

(测试中发现如果去掉每段查询的圆括号则limit 限制整个结果集的返回行数 而不是每段.)

使用union all 联合查询是解决N+1 问题的利器(特别是1 - n 中 n方数据量特别大时),不过JPA 框架在处理union all 查询时有bug (悲催啊)

h3. 实际项目使用情况:

   CRM中商家页面 商家与分店是典型的N+1查询问题,由于有些商家分店数比较多,页面中只展示前3个分店,此时可以使用union all 查询优化 , 通过一条SQL 返回所有商家的前3家分店.

实例参考自Baron Schwartz的博客:http://www.xaprb.com/blog
分享到:
评论

相关推荐

    微软内部资料-SQL性能优化5

    In a nonclustered index, the leaf level contains each index key, plus a bookmark that tells SQL Server where to find the data row corresponding to the key in the index. A bookmark can take one of two ...

    Expert T-SQL Window Functions in SQL Server(Apress,2015)

    Expert T-SQL Window Functions in SQL Server takes you from any level of knowledge of windowing functions and turns you into an expert who can use these powerful functions to solve many T-SQL queries....

    微软内部资料-SQL性能优化3

    An intent lock indicates that SQL Server wants to acquire a shared (S) lock or exclusive (X) lock on some of the resources lower down in the hierarchy. For example, a shared intent lock placed at the ...

    SQL.and.Relational.Theory.How.to.Write.Accurate.SQL.Code.3rd.Edition

    Could you write an SQL query to find employees who have worked at least once in every programming department in the company? And be sure it’s correct? Why is proper column naming so important? Nulls ...

    MMATrans.rar_Grid map_The First_z map_z-map

    These are the rows of Z values of the grid, organized in row order. Each row has a constant Y coordinate. Grid row 1 corresponds to ylo and the last grid row corresponds to yhi. Within each row, the ...

    Foundations for Analytics with Python O-Reilly-2016-Clinton W. Brownley

    The example also shows how to store information separately in both a list and a dictionary in order to create the header row and the data rows for the output file. This is a reminder that you can ...

    Beginning T-SQL with Microsoft SQL Server 2005 and 2008

    Transact-SQL, or T-SQL, is Microsoft Corporation’s powerful implementation of the ANSI standard SQL database query language, which was designed to retrieve, manipulate, and add data to relational ...

    Introducing SQL Server(Apress,2015)

    You'll learn how databases work and how to use the T-SQL language by practicing on one of the most widely-used and powerful database engines in the corporate world: Microsoft SQL Server. Do you ...

    SSD7 选择题。Multiple-Choice

    (b) placing SQL keywords, such as select, under the column names they want to retrieve (c) typing a syntactically correct SQL query that uses column and table names similar to the correct column and...

    The Triangle

    The first line contains one integer N: the number of rows in the triangle. The following N lines describe the data of the triangle. The number of rows in the triangle is &gt; 1 but &lt;= 100. The numbers ...

    Hands-On Database An IntroductIon to Database design and development 2nd

    b: sql server express An overview of how to use the SQL Server Management Studio to create and access databases in SQL Server Express. c: visio An overview of the Visio environment, with a special ...

    Sybex - Mastering Visual Basic .NET Transact SQL (VBL).pdf

    ment, which allows you to calculate totals on groups of the rows retrieved from the database. This statement looks and feels very much like the straight SQL statements, yet it’s not part of standard ...

    oracle advanced sql 高级SQL教程 ORACLE官方教材

    Walking the Tree: From the Top Down 5-9 Ranking Rows with the LEVEL Pseudocolumn 5-10 Formatting Hierarchical Reports Using LEVEL and LPAD 5-11 Pruning Branches 5-13 Summary 5-14 Practice 5 Overview 5...

    Generic_SQL_Query

    针对不同数据库厂商下标准下的sql查询 While polling, JDBC adapter selects all the events from the event table with the event status as zero....Below example gets first 10 rows from the database table

    The triangle

    The first line contains one integer N: the number of rows in the triangle. The following N lines describe the data of the triangle. The number of rows in the triangle is &gt; 1 but &lt;= 100. The numbers ...

    SQL Server 2016 Developer's Guide

    Use the new security features to encrypt or to have more granular control over access to rows in a table Simplify performance troubleshooting with Query Store Discover the potential of R's integration...

    mysql SELECT FOUND_ROWS()与COUNT()用法区别1

    那么,在SELECT查询中包含SQL_CALC_FOUND_ROWS选项,然后执行FOUND_ROWS()就可以了:代码如下mysql&gt; SELECT SQL_

    SSD7 EX1 答案

    In the file Rel-ops.txt, list which relational operations you used, from among the select/project/join operations, in order to perform this query. Explain the role of each operation in your query. 4...

    ora分析脚本

    - ash_sql &lt;sql_id&gt; Show all ash rows group by sampli_time and event for the specified sql_id - [-u ] degree degree of objects for a given user - [-u ] colstats stats for each table, column - [-u ]...

    sqlrows:分析器:通过报告有关* sql.Rows使用错误的诊断信息,帮助发现错误

    sqlrows sqlrows是一个静态代码分析器,它通过报告sql.Rows使用错误的诊断信息来帮助发现错误。 安装 您可以通过go get命令获取sqlrows 。 $ go get -u github.com/gostaticanalysis/sqlrows 快速开始 当Go为1.12...

Global site tag (gtag.js) - Google Analytics