BigQuery: duplicate rank() numbers

Question (votes: 0, answers: 3)

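I have a transactions table with account_no, order_id, start_date, end_date, and a ranking. I am trying to rank each account's transactions by start date and end date. The problem is that the transactions have the same (or nearly the same) start and end dates, so the dates alone cannot tell them apart and the tied rows all get the same rank.

My code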

select distinct account_id, order_id, order_validfrom_date as start_date, order_validto_date as end_date,
  rank() over (partition by account_id order by order_validfrom_date desc, order_validto_date desc) as ranking
from `datamart_dimsum.rpt_dly_dimsum_subscription_details`
where order_validfrom_date <= '2020-01-14' and account_id in (216223)
order by account_id, order_id, order_validfrom_date, order_validto_date

Output

account_id | order_id |  start_date  | end_date   | ranking
  216223     482847      2017-10-09    2017-11-08      1
  216223     472121      2017-10-09    2017-11-08      1
  216223     312312      2017-10-09    2017-11-08      1

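Is there a way to rank the first transaction as 1 even though the start and end dates are the same? I tried the ROW_NUMBER() function, but it did not work.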

sql google-bigquery rank
3 Answers
0
votes

Have you tried adding order_id to the ORDER BY clause?

select distinct account_id, order_id, order_validfrom_date as start_date, order_validto_date as end_date,
  rank() over (partition by account_id order by order_validfrom_date desc, order_validto_date desc, order_id desc) as ranking
from `datamart_dimsum.rpt_dly_dimsum_subscription_details`
where order_validfrom_date <= '2020-01-14' and account_id in (216223)
order by account_id, order_id, order_validfrom_date, order_validto_date
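The effect of the extra sort key can be checked locally. The following is a minimal sketch, not BigQuery itself: it assumes Python's bundled SQLite (3.25 or later, for window functions) as a stand-in, and the table name `orders` and its short column names are hypothetical. The three rows are the ones from the question's output.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery; `orders` is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (account_id INT, order_id INT, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (216223, 482847, "2017-10-09", "2017-11-08"),
        (216223, 472121, "2017-10-09", "2017-11-08"),
        (216223, 312312, "2017-10-09", "2017-11-08"),
    ],
)

# order_id is the last sort key, so rows with identical dates no longer tie.
tie_broken = conn.execute(
    """
    SELECT order_id,
           RANK() OVER (
             PARTITION BY account_id
             ORDER BY start_date DESC, end_date DESC, order_id DESC
           ) AS ranking
    FROM orders
    ORDER BY ranking
    """
).fetchall()

print(tie_broken)
```

This only removes ties if order_id is unique within each account; if two rows share all three sort keys, RANK() still gives them the same number.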

0
votes

Use row_number(). rank() is supposed to return duplicate numbers for tied rows:

 row_number() over (partition by account_id
                    order by order_validfrom_date desc, order_validto_date desc
                   ) as ranking, 
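To see the difference side by side, here is a small sketch that runs both functions over the same window. It assumes Python's bundled SQLite (3.25 or later) in place of BigQuery, and a hypothetical `orders` table holding the three tied rows from the question.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery; `orders` is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (account_id INT, order_id INT, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (216223, 482847, "2017-10-09", "2017-11-08"),
        (216223, 472121, "2017-10-09", "2017-11-08"),
        (216223, 312312, "2017-10-09", "2017-11-08"),
    ],
)

# Same partition and ordering for both functions; all three rows tie on the dates.
compared = conn.execute(
    """
    SELECT order_id,
           RANK() OVER (
             PARTITION BY account_id ORDER BY start_date DESC, end_date DESC
           ) AS rnk,
           ROW_NUMBER() OVER (
             PARTITION BY account_id ORDER BY start_date DESC, end_date DESC
           ) AS rn
    FROM orders
    """
).fetchall()

ranks = sorted(row[1] for row in compared)        # tied rows share one rank
row_numbers = sorted(row[2] for row in compared)  # every row gets its own number

print(ranks, row_numbers)
```

RANK() gives every tied row the same value, while ROW_NUMBER() always hands out 1, 2, 3. Which tied row receives which number is arbitrary here, because the ORDER BY has no unique column.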

0
votes

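Below is for BigQuery Standard SQL.

I see two equally reasonable options for you.

Option 1: just add another field as a tie-breaker

In your case, order_id looks like it should work, because it is most likely unique in your table, so the following should do it: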

#standardSQL
SELECT DISTINCT 
  account_id,
  order_id,
  order_validfrom_date AS start_date,
  order_validto_date AS end_date, 
  RANK() OVER(
    PARTITION BY account_id 
    ORDER BY order_validfrom_date DESC, order_validto_date DESC, order_id DESC -- added order_id here < this is the only change 
  ) AS ranking, 
FROM `datamart_dimsum.rpt_dly_dimsum_subscription_details` 
WHERE order_validfrom_date <= '2020-01-14'  
AND account_id IN (216223)
ORDER BY account_id, order_id DESC, order_validfrom_date, order_validto_date    

Option 2: just replace RANK with ROW_NUMBER, as in the example below

#standardSQL
SELECT DISTINCT 
  account_id,
  order_id,
  order_validfrom_date AS start_date,
  order_validto_date AS end_date, 
  ROW_NUMBER() OVER( -- ROW_NUMBER instead of RANK  is the only change here
    PARTITION BY account_id 
    ORDER BY order_validfrom_date DESC, order_validto_date DESC
  ) AS ranking, 
FROM `datamart_dimsum.rpt_dly_dimsum_subscription_details` 
WHERE order_validfrom_date <= '2020-01-14'  
AND account_id IN (216223)
ORDER BY account_id, order_id DESC, order_validfrom_date, order_validto_date     

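Both options give the output below. With option 2, the order of the row numbers among tied rows is not guaranteed unless a unique column such as order_id is also added to the ORDER BY: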

Row account_id  order_id    start_date  end_date    ranking  
1   216223      482847      2017-10-09  2017-11-08  1    
2   216223      472121      2017-10-09  2017-11-08  2    
3   216223      312312      2017-10-09  2017-11-08  3    
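The question does not say how ROW_NUMBER() failed. One possible cause, and this is only an assumption, is that the source table holds exact duplicate rows: window functions are evaluated before DISTINCT, so each duplicate gets its own row number and DISTINCT can no longer collapse them. The sketch below illustrates this with Python's bundled SQLite (3.25 or later) standing in for BigQuery and a hypothetical `orders` table in which one order appears twice.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery; `orders` is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (account_id INT, order_id INT, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (216223, 482847, "2017-10-09", "2017-11-08"),
        (216223, 482847, "2017-10-09", "2017-11-08"),  # assumed exact duplicate
        (216223, 472121, "2017-10-09", "2017-11-08"),
        (216223, 312312, "2017-10-09", "2017-11-08"),
    ],
)

# DISTINCT runs after ROW_NUMBER(), so the duplicate survives with its own number.
naive = conn.execute(
    """
    SELECT DISTINCT order_id,
           ROW_NUMBER() OVER (
             PARTITION BY account_id
             ORDER BY start_date DESC, end_date DESC, order_id DESC
           ) AS ranking
    FROM orders
    """
).fetchall()

# Deduplicate in a subquery first, then number the remaining rows.
deduped = conn.execute(
    """
    SELECT order_id,
           ROW_NUMBER() OVER (
             PARTITION BY account_id
             ORDER BY start_date DESC, end_date DESC, order_id DESC
           ) AS ranking
    FROM (SELECT DISTINCT account_id, order_id, start_date, end_date FROM orders)
    ORDER BY ranking
    """
).fetchall()

print(len(naive), deduped)
```

If the real table has no duplicate rows, the subquery is unnecessary and option 2 with order_id as a tie-breaker is enough.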