Snowflake: get the previous distinct value

Question (votes: 0, answers: 2)

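I have a table that tracks locations by ID, along with each device's issue date and the products/accessories associated with the device. A location can only have one device at a time, and that device may get new accessories fairly often. Every time a new accessory is associated, a new record is created.
I am trying to add the previous device's information to each record, if there is any.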

I have a table like this:

ID_NO  DEVICE_NO  DEVICE_DATE  PRODUCT_NO  PRODUCT_DATE
FD2A   600076     2011-09-20   210785      2012-01-03
FD2A   208049     2017-09-11   066762      2017-09-11
FD2A   208049     2017-09-11   009802      2023-09-12
C600   202650     2009-03-25   127677      2009-03-25
C600   215580     2012-04-04   127677      2010-10-06
C600   215580     2012-04-04   245791      2012-04-10
C600   215580     2012-04-04   366424      2013-09-06
C600   215580     2012-04-04   105547      2014-01-31
C600   215580     2012-04-04   503592      2015-10-01
C600   209855     2015-11-16   484106      2015-10-09
C600   600382     2020-08-24   347302      2016-08-25

Using the following query:

select
id_no
,device_no
,device_date
,product_no
,product_date
,lag(device_no) over (partition by id_no order by device_date, product_date) prev_device_no
,lag(device_date) over (partition by id_no order by device_date, product_date) prev_device_date
from device_data
order by id_no,device_date,product_date

I get the following result:

ID_NO  DEVICE_NO  DEVICE_DATE  PRODUCT_NO  PRODUCT_DATE  PREV_DEVICE_NO  PREV_DEVICE_DATE
FD2A   600076     2011-09-20   210785      2012-01-03
FD2A   208049     2017-09-11   066762      2017-09-11    600076          2011-09-20
FD2A   208049     2017-09-11   009802      2023-09-12    208049          2017-09-11
C600   202650     2009-03-25   127677      2009-03-25
C600   215580     2012-04-04   127677      2010-10-06    202650          2009-03-25
C600   215580     2012-04-04   245791      2012-04-10    215580          2012-04-04
C600   215580     2012-04-04   366424      2013-09-06    215580          2012-04-04
C600   215580     2012-04-04   105547      2014-01-31    215580          2012-04-04
C600   215580     2012-04-04   503592      2015-10-01    215580          2012-04-04
C600   209855     2015-11-16   484106      2015-10-09    215580          2012-04-04
C600   600382     2020-08-24   347302      2016-08-25    209855          2015-11-16

What I really want is the previous distinct device_no and its date, like this:

ID_NO  DEVICE_NO  DEVICE_DATE  PRODUCT_NO  PRODUCT_DATE  PREV_DEVICE_NO  PREV_DEVICE_DATE
FD2A   600076     2011-09-20   210785      2012-01-03
FD2A   208049     2017-09-11   066762      2017-09-11    600076          2011-09-20
FD2A   208049     2017-09-11   009802      2023-09-12    600076          2011-09-20
C600   202650     2009-03-25   127677      2009-03-25
C600   215580     2012-04-04   127677      2010-10-06    202650          2009-03-25
C600   215580     2012-04-04   245791      2012-04-10    202650          2009-03-25
C600   215580     2012-04-04   366424      2013-09-06    202650          2009-03-25
C600   215580     2012-04-04   105547      2014-01-31    202650          2009-03-25
C600   215580     2012-04-04   503592      2015-10-01    202650          2009-03-25
C600   209855     2015-11-16   484106      2015-10-09    215580          2012-04-04
C600   600382     2020-08-24   347302      2016-08-25    209855          2015-11-16

Is there another function that can get the last distinct value within a partition?

sql snowflake-cloud-data-platform lag
2 Answers

Votes: 1

Try combining LAG, CTEs, and CASE:

WITH DeviceHistory AS (
  SELECT 
    ID_NO,
    DEVICE_NO,
    DEVICE_DATE,
    PRODUCT_NO,
    PRODUCT_DATE,
    LAG(DEVICE_NO) OVER (PARTITION BY ID_NO ORDER BY DEVICE_DATE, PRODUCT_DATE) AS prev_device_no,
    LAG(DEVICE_DATE) OVER (PARTITION BY ID_NO ORDER BY DEVICE_DATE, PRODUCT_DATE) AS prev_device_date
  FROM your_table
),
FilteredHistory AS (
  SELECT
    ID_NO,
    DEVICE_NO,
    DEVICE_DATE,
    PRODUCT_NO,
    PRODUCT_DATE,
    CASE 
      WHEN prev_device_no IS NOT NULL AND prev_device_no != DEVICE_NO THEN prev_device_no
      ELSE NULL
    END AS prev_diff_device_no,
    CASE 
      WHEN prev_device_date IS NOT NULL AND prev_device_date != DEVICE_DATE THEN prev_device_date
      ELSE NULL
    END AS prev_diff_device_date,
    ROW_NUMBER() OVER (PARTITION BY ID_NO, DEVICE_NO ORDER BY DEVICE_DATE, PRODUCT_DATE) AS rn
  FROM DeviceHistory
)
SELECT
  ID_NO,
  DEVICE_NO,
  DEVICE_DATE,
  PRODUCT_NO,
  PRODUCT_DATE,
  prev_diff_device_no AS PREV_DEVICE_NO,
  prev_diff_device_date AS PREV_DEVICE_DATE
FROM FilteredHistory
QUALIFY 
  rn = 1 OR prev_diff_device_no IS NOT NULL
ORDER BY ID_NO, DEVICE_DATE, PRODUCT_DATE;
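
One thing to watch in the query above: the QUALIFY clause appears to keep only the first row of each device, while the desired output in the question keeps every row. A possible way to carry the value forward instead is sketched below. It assumes the `device_data` table from the question, assumes that (device_date, product_date) orders the rows unambiguously within an id_no, and relies on Snowflake's `LAST_VALUE ... IGNORE NULLS` with an explicit window frame. The CTE name `marked` and the `lag_*` column names are made up for illustration.

```sql
-- Sketch only: marks the row where the device changes, then carries that
-- marker forward to every later row until the next change.
WITH marked AS (
  SELECT
    id_no,
    device_no,
    device_date,
    product_no,
    product_date,
    -- Device on the immediately preceding row for this location
    LAG(device_no)   OVER (PARTITION BY id_no ORDER BY device_date, product_date) AS lag_device_no,
    LAG(device_date) OVER (PARTITION BY id_no ORDER BY device_date, product_date) AS lag_device_date
  FROM device_data
)
SELECT
  id_no,
  device_no,
  device_date,
  product_no,
  product_date,
  -- Non-NULL only on the row where the device changes; IGNORE NULLS then
  -- picks the most recent such row at or before the current row.
  LAST_VALUE(IFF(lag_device_no <> device_no, lag_device_no, NULL)) IGNORE NULLS
    OVER (PARTITION BY id_no ORDER BY device_date, product_date
          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS prev_device_no,
  LAST_VALUE(IFF(lag_device_no <> device_no, lag_device_date, NULL)) IGNORE NULLS
    OVER (PARTITION BY id_no ORDER BY device_date, product_date
          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS prev_device_date
FROM marked
ORDER BY id_no, device_date, product_date;
```

Rows belonging to the first device of a location never see a change marker, so both PREV_ columns stay NULL for them.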

Votes: 1

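I think I have something here that will work for you. I only tested it with the first data group, so it may need tweaking.

What I did was take your table and find all the records that differ based on `device_no`. I am assuming that `device_no` does not change and then change back to a previous value... I think that would break this.

Once I had those in the `data_delta` CTE, I could do an `asof` join to get the previous "change record" for each record in the original table.

The full code is shown below, but the first `data` CTE is just some of your sample data: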

with data as (
    select *
    from (values('FD2A',600076,'2011-09-20'::date,'2012-01-03'::date),      
    ('FD2A',208049,'2017-09-11'::date,'2017-09-11'::date),
    ('FD2A',208049,'2017-09-11'::date,'2023-09-12'::date)
    ) x (id_no,device_no,device_date,product_date)
), data_delta as (
    select * from data
    qualify row_number() over (partition by id_no, device_no order by device_date, product_date) = 1
)
select d1.*
     , d2.device_no as prev_device_no
     , d2.device_date as prev_device_date
from data d1
asof join data_delta d2
  match_condition(d1.device_date > d2.device_date)
  on (d1.id_no = d2.id_no)
order by d1.product_date
  ;
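
If a device could leave a location and later come back with a new device_date, one possible adjustment is to build `data_delta` from the distinct (id_no, device_no, device_date) combinations, so that each stint counts as its own change record. The sketch below uses the C600 rows from the question and is intended to reproduce the PREV_ columns the question asks for. Like the query above, it leaves out product_no, and it assumes a device keeps a single device_date for the whole stint.

```sql
-- Sketch only: same ASOF JOIN idea, but one change record per
-- (id_no, device_no, device_date) instead of one per device_no.
WITH data AS (
    SELECT *
    FROM (VALUES
        ('C600', 202650, '2009-03-25'::date, '2009-03-25'::date),
        ('C600', 215580, '2012-04-04'::date, '2010-10-06'::date),
        ('C600', 215580, '2012-04-04'::date, '2012-04-10'::date),
        ('C600', 215580, '2012-04-04'::date, '2013-09-06'::date),
        ('C600', 215580, '2012-04-04'::date, '2014-01-31'::date),
        ('C600', 215580, '2012-04-04'::date, '2015-10-01'::date),
        ('C600', 209855, '2015-11-16'::date, '2015-10-09'::date),
        ('C600', 600382, '2020-08-24'::date, '2016-08-25'::date)
    ) x (id_no, device_no, device_date, product_date)
), data_delta AS (
    -- One row per device stint at a location
    SELECT DISTINCT id_no, device_no, device_date
    FROM data
)
SELECT d1.*
     , d2.device_no   AS prev_device_no
     , d2.device_date AS prev_device_date
FROM data d1
ASOF JOIN data_delta d2
  -- Closest change record strictly earlier than the current device's date
  MATCH_CONDITION (d1.device_date > d2.device_date)
  ON d1.id_no = d2.id_no
ORDER BY d1.id_no, d1.device_date, d1.product_date;
```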