Using a "retain" approach (similar to SAS) in BigQuery


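When defining a variable in BigQuery, how can I compare a value in one row with the value computed for the row above it (while that variable is still being created)?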

I have a table with ID, StartDate, and EndDate. I want to create a MaxDate variable that, starting from the second row, checks whether each row's StartDate is <= the "lag" EndDate. If the condition is true, MaxDate should be the maximum of the EndDate and the "previous" EndDate immediately above.

ID  StartDate   EndDate     MaxDate
A   2019-10-25  2019-10-31  2019-10-31
A   2019-10-26  2019-10-26  2019-10-31
A   2019-10-28  2019-10-30  2019-10-31
A   2019-10-29  2019-10-29  2019-10-31

The following approach fails at the second lagged row:


WITH
S1 AS
(
SELECT ID, StartDate, EndDate, 1 AS COUNT  -- S2 and S3 read StartDate and EndDate, so S1 must select them
FROM S0
ORDER BY ID, StartDate, EndDate
)

,S2 AS
(
SELECT *,

CASE WHEN StartDate < LAG(EndDate) OVER (PARTITION BY ID ORDER BY StartDate, EndDate)
THEN
  MAX(EndDate) OVER (PARTITION BY ID ORDER BY StartDate, EndDate
    ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
      ELSE EndDate
END AS EndDate2
FROM S1
)

,S3 AS
(SELECT *,

CASE WHEN StartDate > LAG(EndDate2) OVER (PARTITION BY ID ORDER BY StartDate, EndDate) + 1
THEN 
  MAX(COUNT) OVER (PARTITION BY ID ORDER BY StartDate, EndDate
    ROWS 1 PRECEDING) + 1
      ELSE 1
END AS Visit_Count

FROM S2
ORDER BY ID, StartDate, EndDate
)

(
SELECT ID, Visit_Count, MIN(StartDate) AS StartDate, MAX(EndDate) AS EndDate
FROM S3
GROUP BY ID, Visit_Count
ORDER BY ID, Visit_Count
)
;

Thanks!

google-bigquery retain
1 Answer

Figured it out:

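-- Column and table names follow the question (ID, S0).
-- Note: LAG looks only at the immediately preceding row.
SELECT *,
  CASE
    WHEN StartDate <= LAG(EndDate) OVER (PARTITION BY ID ORDER BY StartDate)
      -- Running maximum of EndDate up to the current row.
      THEN MAX(EndDate) OVER (PARTITION BY ID ORDER BY StartDate)
    -- The first row has no lag value; the expected output shows its own EndDate.
    ELSE EndDate
  END AS MaxDate
FROM S0
;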
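
One caveat with any LAG-based comparison: LAG only sees the row directly above, so in the sample data the third row (StartDate 2019-10-28) is compared against 2019-10-26 rather than the retained 2019-10-31. A possible way to emulate SAS RETAIN here is to compare against the running maximum of all earlier EndDate values instead. The following is a minimal sketch, not a tested production query. The inline S0 rows are copied from the question, the names PrevMaxEnd, flagged, and grouped are made up for illustration, and it assumes rows overlap when StartDate <= the retained EndDate (the question's own attempt also allowed a one-day gap).

```sql
-- Sketch only: S0 is rebuilt inline from the sample rows in the question.
WITH S0 AS (
  SELECT 'A' AS ID, DATE '2019-10-25' AS StartDate, DATE '2019-10-31' AS EndDate UNION ALL
  SELECT 'A', DATE '2019-10-26', DATE '2019-10-26' UNION ALL
  SELECT 'A', DATE '2019-10-28', DATE '2019-10-30' UNION ALL
  SELECT 'A', DATE '2019-10-29', DATE '2019-10-29'
),
flagged AS (
  SELECT *,
    -- Latest EndDate among all earlier rows of the same ID (the "retained" value).
    -- PrevMaxEnd is a hypothetical column name; it is NULL on the first row.
    MAX(EndDate) OVER (
      PARTITION BY ID ORDER BY StartDate, EndDate
      ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PrevMaxEnd
  FROM S0
),
grouped AS (
  SELECT *,
    -- Running count of rows that overlap nothing seen so far; each one starts a new visit.
    COUNTIF(PrevMaxEnd IS NULL OR StartDate > PrevMaxEnd) OVER (
      PARTITION BY ID ORDER BY StartDate, EndDate
      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Visit_Count
  FROM flagged
)
SELECT ID, StartDate, EndDate, Visit_Count,
  -- Running maximum of EndDate within the current visit.
  MAX(EndDate) OVER (
    PARTITION BY ID, Visit_Count ORDER BY StartDate, EndDate
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS MaxDate
FROM grouped
ORDER BY ID, StartDate, EndDate;
```

If the goal is one row per visit, as in the question's final SELECT, grouping the grouped CTE by ID and Visit_Count with MIN(StartDate) and MAX(EndDate) should give the collapsed ranges.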