
Using row_number() in Spark SQL


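I originally wrote this with the window function API, but decided that plain SQL would be more convenient. It turned out that in Spark SQL an alias created with "as" cannot be used in the where clause of the same query, because where is evaluated before the select list (and its window functions) is computed. The query has to be wrapped in one more select, and the filter applied there.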

// rn is defined in the inner query, so only the outer query can filter on it.
val topDF = spark.sql(
  """select * from (
    |  select day, city, cmsId, count(cmsId) as ts,
    |         row_number() over (partition by city order by count(cmsId)) as rn
    |  from data_log where day = '20170511' and cmsType = 'video'
    |  group by city, day, cmsId order by city, rn
    |) T where T.rn <= 3""".stripMargin)
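
For comparison, here is a rough sketch of what the same query might look like with the DataFrame window API. This is not the original API version, which the post does not show. The sample rows, the object name TopPerCitySketch and the local SparkSession are assumptions made up for illustration; only the column names come from the SQL above. The window orders by ts ascending to mirror the SQL. Ordering by col("ts").desc instead would rank the highest counts first.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, count, row_number}

// Hypothetical standalone sketch; the object name is illustrative only.
object TopPerCitySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("row-number-sketch")
      .master("local[*]") // assumption: local run for demonstration
      .getOrCreate()
    import spark.implicits._

    // Made-up rows standing in for the data_log table.
    val dataLog = Seq(
      ("20170511", "beijing", "c1", "video"),
      ("20170511", "beijing", "c1", "video"),
      ("20170511", "beijing", "c2", "video"),
      ("20170511", "shanghai", "c3", "video"),
      ("20170511", "shanghai", "c4", "news")
    ).toDF("day", "city", "cmsId", "cmsType")

    // Same filter and grouping as the inner SQL query.
    val counts = dataLog
      .filter(col("day") === "20170511" && col("cmsType") === "video")
      .groupBy("city", "day", "cmsId")
      .agg(count(col("cmsId")).as("ts"))

    // One window per city, ordered by the count (ascending, as in the SQL).
    val byCity = Window.partitionBy("city").orderBy(col("ts"))

    // rn is added as a real column first, so a later filter can refer to it.
    val topDF = counts
      .withColumn("rn", row_number().over(byCity))
      .filter(col("rn") <= 3)
      .orderBy("city", "rn")

    topDF.show()
    spark.stop()
  }
}
```

Here the extra select is not needed, because withColumn materializes rn before the filter step runs. That corresponds to what the outer select does in the SQL version.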

  
