Using the LAG and LEAD functions in SQL


I have this table of medical patients:

CREATE TABLE SAMPLE_DATA (
    patient_name VARCHAR(50),
    year INTEGER,
    gender CHAR(1),
    patient_weight DECIMAL(5,2),
    location VARCHAR(20)
);

INSERT INTO SAMPLE_DATA (patient_name, year, gender, patient_weight, location) VALUES
    ('Sarah', 2010, 'F', 65.00, 'hospital'),
    ('Sarah', 2012, 'F', 66.00, 'home'),
    ('Sarah', 2013, 'F', 67.00, 'hospital'),
    
    ('Michael', 2011, 'M', 78.00, 'hospital'),
    ('Michael', 2013, 'M', 76.00, 'home'),
    ('Michael', 2015, 'M', 77.00, 'hospital'),
    
    ('James', 2010, 'M', 82.00, 'home'),
    ('James', 2014, 'M', 80.00, 'hospital'),
    
    ('Emma', 2012, 'F', 70.00, 'hospital'),
    ('Emma', 2013, 'F', 71.00, 'home'),
    ('Emma', 2015, 'F', 71.00, 'hospital'),
    
    ('Robert', 2011, 'M', 88.00, 'hospital'),
    ('Robert', 2014, 'M', 85.00, 'home'),
    
    ('Maria', 2010, 'F', 63.00, 'hospital'),
    ('Maria', 2012, 'F', 64.00, 'home'),
    ('Maria', 2015, 'F', 64.00, 'hospital');

The table looks like this:

patient_name | year | gender | patient_weight | location
-------------|------|--------|----------------|----------
Sarah        | 2010 | F      | 65.00         | hospital
Sarah        | 2012 | F      | 66.00         | home
Sarah        | 2013 | F      | 67.00         | hospital
Michael      | 2011 | M      | 78.00         | hospital
Michael      | 2013 | M      | 76.00         | home
Michael      | 2015 | M      | 77.00         | hospital
James        | 2010 | M      | 82.00         | home
James        | 2014 | M      | 80.00         | hospital
Emma         | 2012 | F      | 70.00         | hospital
Emma         | 2013 | F      | 71.00         | home
Emma         | 2015 | F      | 71.00         | hospital
Robert       | 2011 | M      | 88.00         | hospital
Robert       | 2014 | M      | 85.00         | home
Maria        | 2010 | F      | 63.00         | hospital
Maria        | 2012 | F      | 64.00         | home
Maria        | 2015 | F      | 64.00         | hospital

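Desired result: I want to transform this table so that it shows what happened to each patient between each pair of consecutive measurements: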

patient_name | start_year | gender | start_weight | years_until_next | location_change
-------------|------------|--------|--------------|------------------|----------------
Sarah        | 2010       | F      | 65.00        | 2                | hospital-home
Sarah        | 2012       | F      | 66.00        | 1                | home-hospital
Michael      | 2011       | M      | 78.00        | 2                | hospital-home
Michael      | 2013       | M      | 76.00        | 2                | home-hospital
James        | 2010       | M      | 82.00        | 4                | home-hospital
Emma         | 2012       | F      | 70.00        | 1                | hospital-home
Emma         | 2013       | F      | 71.00        | 2                | home-hospital
Robert       | 2011       | M      | 88.00        | 3                | hospital-home
Maria        | 2010       | F      | 63.00        | 2                | hospital-home
Maria        | 2012       | F      | 64.00        | 3                | home-hospital

I am new to the LAG and LEAD functions in SQL, and I tried the following:

WITH next_measurements AS (
    SELECT 
        patient_name,
        year as start_year,
        gender,
        patient_weight as start_weight,
        location as start_location,
        LEAD(year) OVER (
            PARTITION BY patient_name 
            ORDER BY year
        ) as next_year,
        LEAD(location) OVER (
            PARTITION BY patient_name 
            ORDER BY year
        ) as next_location
    FROM sample_data
)
SELECT 
    patient_name,
    start_year,
    gender,
    start_weight,
    (next_year - start_year) as years_until_next,
    LOWER(start_location) || '-' || LOWER(next_location) as location_change
FROM next_measurements
WHERE next_year IS NOT NULL
ORDER BY patient_name, start_year;

The code seems to work:

 patient_name start_year gender start_weight years_until_next location_change
         Emma       2012      F           70                1   hospital-home
         Emma       2013      F           71                2   home-hospital
        James       2010      M           82                4   home-hospital
        Maria       2010      F           63                2   hospital-home
        Maria       2012      F           64                3   home-hospital
      Michael       2011      M           78                2   hospital-home
      Michael       2013      M           76                2   home-hospital
       Robert       2011      M           88                3   hospital-home
        Sarah       2010      F           65                2   hospital-home
        Sarah       2012      F           66                1   home-hospital

Is this the correct way to use these functions?

sql db2
1 Answer

LAG: accesses data from a preceding row.

LEAD: accesses data from a following row.
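As a small illustration of these two definitions, the sketch below runs both functions over Sarah's three rows from the question's sample_data table. The aliases prior_year and next_year are names chosen here for illustration.

```sql
-- LAG looks one row back and LEAD one row forward,
-- following the ORDER BY inside the OVER clause.
SELECT
    year,
    LAG(year)  OVER (ORDER BY year) AS prior_year,
    LEAD(year) OVER (ORDER BY year) AS next_year
FROM sample_data
WHERE patient_name = 'Sarah'
ORDER BY year;

-- year | prior_year | next_year
-- 2010 | NULL       | 2012
-- 2012 | 2010       | 2013
-- 2013 | 2012       | NULL
```

The first row has no preceding row and the last row has no following row, so those positions come back as NULL.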

Both LAG and LEAD let you compare values across rows without a self-join, so you can compare the current row with the previous or the next row.
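To make the self-join comparison concrete, here is a sketch that computes the next measurement year in both ways against the question's sample_data table. It assumes each patient has at most one row per year, which holds for the sample data. The aliases cur and nxt are chosen here for illustration.

```sql
-- 1) With LEAD: each patient's rows are ordered by year and
--    the following row's year is read directly.
SELECT
    patient_name,
    year,
    LEAD(year) OVER (PARTITION BY patient_name ORDER BY year) AS next_year
FROM sample_data;

-- 2) Without window functions: join every row to all later rows of
--    the same patient and keep the smallest later year.
SELECT
    cur.patient_name,
    cur.year,
    MIN(nxt.year) AS next_year
FROM sample_data AS cur
LEFT JOIN sample_data AS nxt
       ON nxt.patient_name = cur.patient_name
      AND nxt.year > cur.year
GROUP BY cur.patient_name, cur.year;
```

Under that assumption both queries should return the same rows, with next_year NULL on each patient's last measurement. The LEAD version is shorter and avoids joining the table to itself.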

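Your query correctly uses LEAD on year and location to retrieve the values from the next row, and from those it derives the time gap and the location change.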

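However, I don't see LAG used anywhere in your query, and I don't think your desired result needs it. If you wanted to use both, so that you also get the change from the previous measurement, it would look like this: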

WITH next_measurements AS (
    SELECT 
        patient_name,
        year as start_year,
        gender,
        patient_weight as start_weight,
        location as start_location,
        LEAD(year) OVER (
            PARTITION BY patient_name 
            ORDER BY year
        ) as next_year,
        LEAD(location) OVER (
            PARTITION BY patient_name 
            ORDER BY year
        ) as next_location,
        LAG(year) OVER (
            PARTITION BY patient_name 
            ORDER BY year
        ) as prior_year,
        LAG(location) OVER (
            PARTITION BY patient_name 
            ORDER BY year
        ) as prior_location
    FROM sample_data
)
SELECT 
    patient_name,
    start_year,
    gender,
    start_weight,
    (next_year - start_year) as years_until_next,
    LOWER(start_location) || '-' || LOWER(next_location) as location_change,
    -- the prior_* values are NULL on each patient's first measurement
    (start_year - prior_year) as years_until_prior,
    LOWER(prior_location) || '-' || LOWER(start_location) as prior_location_change
FROM next_measurements
WHERE next_year IS NOT NULL
ORDER BY patient_name, start_year;
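One thing to watch in the query above: on each patient's first measurement there is no previous row, so prior_year and prior_location are NULL, and concatenating a NULL with || produces NULL as well. The sketch below is one possible way to handle that with COALESCE. The CTE name measurements and the label 'first' are hypothetical placeholders introduced here, not part of the original data or answer.

```sql
WITH measurements AS (
    SELECT
        patient_name,
        year AS start_year,
        location AS start_location,
        LEAD(year) OVER (PARTITION BY patient_name ORDER BY year) AS next_year,
        LAG(year) OVER (PARTITION BY patient_name ORDER BY year) AS prior_year,
        LAG(location) OVER (PARTITION BY patient_name ORDER BY year) AS prior_location
    FROM sample_data
)
SELECT
    patient_name,
    start_year,
    -- stays NULL on the first measurement, because there is no earlier year
    (start_year - prior_year) AS years_until_prior,
    -- 'first' is a hypothetical label that replaces the missing prior location
    COALESCE(LOWER(prior_location), 'first') || '-' || LOWER(start_location)
        AS prior_location_change
FROM measurements
WHERE next_year IS NOT NULL
ORDER BY patient_name, start_year;
```

For Sarah this should give 'first-hospital' for 2010 and 'hospital-home' for 2012. The window functions are evaluated inside the CTE over all rows, so the WHERE filter in the outer query removes each patient's last measurement without affecting the LAG values of the remaining rows.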

Official documentation: https://www.ibm.com/docs/en/db2/12.1?topic=expressions-olap-specification#sdx-synid_frag-lag-function

Examples: https://www.ibm.com/docs/en/psfa/7.1.0?topic=functions-lag-lead-family-syntax
