对表中数据分组,有时只需要某列的聚合值;有时却需要返回整行数据,常用的方法有:子查询、ROW_NUMBER、APPLY,总体感觉还是ROW_NUMBER比较直观。
测试数据:
if OBJECT_ID ( ' testGroup ' ) is not null drop table testGroup GO create table testGroup(ID int identity primary key ,UserID int ,OrderID int ) GO insert testGroup select 1 , 10 union all select 1 , 20 union all select 1 , 30 union all select 2 , 100 union all select 2 , 200 union all select 3 , 1000 union all select 3 , 2000 union all select 3 , 3000 union all select 3 , 4000
一. 取分组中第1行(最大/最小值)
1. 取出分组中某列最大/最小值,不要求显示其他列
最常见的分组聚合,用group by 分组时,只有参加分组/聚合的列才可以被显示。
select UserID, MAX (OrderID) as MaxOrderID from testGroup group by UserID
2. 取出分组中某列最大/最小值,要求显示其他列
要显示表中其他列,用group by 不好实现,可以借助子查询。
select * from testGroup a where ID = ( select MAX (ID) from testGroup b where a.UserID = b.UserID) order by ID -- 或者 select * from testGroup where ID in ( select MAX (ID) from testGroup group by UserID) -- 或者 select * from testGroup as a where a.ID in ( select top 1 ID from testGroup b where a.UserID = b.UserID order by b.OrderID desc ) -- 或者 select * from testGroup a where not exists ( select 1 from testGroup b where a.UserID = b.UserID and a.OrderID < b.OrderID) -- 或者 select * from testGroup a where ( select count ( 1 ) from testGroup b where a.UserID = b.UserID and a.id <= b.id) = 1
二. 取分组中前N行(排名前几名)
前N行为正向排序(ASC),后N行改为反向排序(DESC)即可,N=1时也就是取最大/最小值的行。下面以前2名(N=2)为例。
1. SQL Server 2000的写法
(1)子查询
select * from testGroup as a where a.ID in ( select top 2 ID from testGroup b where a.UserID = b.UserID order by b.OrderID) -- 或者 select * from testGroup a where not exists ( select 1 from testGroup b where a.UserID = b.UserID and a.OrderID > b.OrderID having count ( 1 ) >= 2 ) -- 或者 select * from testGroup a where ( select count ( 1 ) from testGroup b where a.UserID = b.UserID and a.ID >= b.ID) <= 2 -- 没有唯一标识的表,可以用checksum来标识每行 select * from testGroup as a where checksum( * ) in ( select top 2 checksum( * ) from testGroup b where a.UserID = b.UserID order by b.OrderID)
2. SQL Server 2005新语法
(2) ROW_NUMBER()
select ID, UserID, OrderID from ( select * , ROW_NUMBER() over (partition by UserID order by OrderID) num from testGroup ) t where t.num between 1 and 2
(3) APPLY(TOP)
select distinct t. * from testGroup a cross apply ( select top 2 ID, UserID, OrderID from testGroup b where a.UserID = b.UserID order by b.OrderID) as t
三. 取分组中第N行(排名第N名)
把上面的查询中,范围值都改为固定值,就可以取具体某一行了,下面以第3名(N=3)为例。
(1) 子查询
select * from testGroup a where ( select count ( 1 ) from testGroup b where a.UserID = b.UserID and a.OrderID >= b.OrderID) = 3 -- 或者 select * from testGroup a where exists ( select 1 from testGroup b where a.UserID = b.UserID and a.OrderID >= b.OrderID having count ( 1 ) = 3 )
(2) ROW_NUMBER()
select ID, UserID, OrderID from ( select * , ROW_NUMBER() over (partition by UserID order by OrderID) num from testGroup ) t where t.num = 3