Get data from the lead record when a condition or filter matches, ignoring the rows between the current row and the selected (lead) row


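How do I write a CTE in SQL Server that, for each person's start record, takes the end date from the next row with Type = 'NA'?

If several NA rows follow, take the first one when ordered by dt_eff ascending.

If any other types appear between the start of a record and the Type = 'NA' row, those rows should be ignored. Another type for the same person counts as a new start only after the previous record has ended (see the person 124 scenario).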

Source data

Person    Type     dt_eff
123       A        2018-10-23 <start of record>
123       NA       2018-11-19 <end date for the row above; not selected in the output>
123       NA       2018-12-25 <not selected in the output>
124       A        2020-01-01 <start of record>
124       B        2020-02-15 <ignored; not selected in the output>
124       NA       2020-05-14 <end date for the start record; not selected in the output>
124       C        2020-10-13 <previous record has ended, so this is a new start>
124       NA       2021-01-15 <end date for the second start record>
124       A        2021-05-22 <previous record has ended, so this is a new start>
124       T        2021-08-22 <ignored; not selected in the output>
456       NA       2022-04-19 <ignored: no preceding row with a valid type>
456       A        2022-05-01 <start of record; end date is NULL because no Type = NA follows>
456       B        2022-07-15 <ignored>

Expected output

Person    Type     dt_start     dt_end
123       A        2018-10-23   2018-11-19
124       A        2020-01-01   2020-05-14
124       C        2020-10-13   2021-01-15
124       A        2021-05-22   NULL
456       A        2022-05-01   NULL

DDL and DML for the source data above

CREATE TABLE Person (
  Person INTEGER,
  Type VARCHAR(3),
  dt_eff Date
);

INSERT INTO Person (Person, Type, dt_eff)
VALUES
(123,'A','2018-10-23'),
(123,'NA','2018-11-19'),
(123,'NA','2018-12-25'),
(124,'A','2020-01-01'),
(124,'B','2020-02-15'),
(124,'NA','2020-05-14'),
(124,'C','2020-10-13'),
(124,'NA','2021-01-15'),
(124,'A','2021-05-22'),
(124,'T','2021-08-22'),
(456,'NA','2022-04-19'),
(456,'A','2022-05-01'),
(456,'B','2022-07-15');

My attempt

with cte1 as (
  select *
    , lead(dt_eff) over (partition by Person order by dt_eff) dt_eff_lead
    , lag(Type, 1, Type) over (partition by Person order by dt_eff) type_lag
  from Person
), cte2 as (
  select Person, Type, dt_eff Start_Date
    , dt_eff_lead
    , sum(case when Type <> type_lag and type='NA' then 1 else 0 end)
      over (partition by person order by dt_eff asc
        rows between unbounded preceding and current row) TypeGroup
  from cte1
)
select Person, Type, Start_Date as dt_start
  , max(dt_eff_lead) over (partition by Person, TypeGroup) dt_end
from cte2
where Type<>'NA'
order by Person, Start_Date, Type;

Tags: sql, sql-server, t-sql, sql-server-2012

1 answer

You can build the groups from a running count of the NA rows:

SELECT  Person, Type
,   dt_start, CASE WHEN cntNA > 0 THEN dt_end END AS dt_end
FROM    (
    SELECT  MIN(dt_eff) OVER(PARTITION BY person, grouping) AS dt_start
    ,   MAX(dt_eff) OVER(PARTITION BY person, grouping) AS dt_end
    ,   COUNT(CASE WHEN type = 'NA' THEN 1 END) OVER(PARTITION BY person, grouping) AS cntNA
    ,   ROW_NUMBER() OVER(PARTITION BY person, grouping ORDER BY dt_eff) AS startrow
    ,   *
    FROM    (
        SELECT  *
        ,   COUNT(CASE WHEN Type = 'NA' THEN 1 END) OVER(PARTITION BY Person ORDER BY dt_eff ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grouping
        FROM
        (
            VALUES  (123, N'A', N'2018-10-23')
            ,   (123, N'NA', N'2018-11-19')
            ,   (123, N'NA', N'2018-12-25')
            ,   (124, N'A', N'2020-01-01')
            ,   (124, N'B', N'2020-02-15')
            ,   (124, N'NA', N'2020-05-14')
            ,   (124, N'C', N'2020-10-13')
            ,   (124, N'NA', N'2021-01-15')
            ,   (124, N'A', N'2021-05-22')
            ,   (124, N'T', N'2021-08-22')
            ,   (456, N'NA', N'2022-04-19')
            ,   (456, N'A', N'2022-05-01')
            ,   (456, N'B', N'2022-07-15')
        ) t (Person,Type,dt_eff)
    ) x
) x
WHERE   startrow = 1
AND type <> 'NA'
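
  1. COUNT(CASE WHEN Type = 'NA' THEN 1 END) OVER(PARTITION BY Person ORDER BY dt_eff ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) creates the grouping value. Ending the frame at 1 PRECEDING means an NA row is counted only from the following row onward, so the NA itself stays in the same group as the rows before it, which is the grouping you're looking for.

  2. Then you compute the min / max date of each group, plus a row number for the first row, because you're only interested in one row per group. cntNA holds the number of NAs in the group, because the groups without any NA need their dt_end set to NULL.

  3. Finally, you select what you're looking for. CASE WHEN cntNA > 0 THEN dt_end END creates the open-ended dates.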
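
The three steps above can also be traced outside the database. The following is a minimal Python sketch of the same grouping idea, not a replacement for the query: the function name periods and the tuple layout are assumptions made for illustration, and the dates are kept as ISO strings, which sort correctly as text.

```python
from itertools import groupby

# Sample rows from the question: (person, type, dt_eff).
rows = [
    (123, "A", "2018-10-23"), (123, "NA", "2018-11-19"), (123, "NA", "2018-12-25"),
    (124, "A", "2020-01-01"), (124, "B", "2020-02-15"), (124, "NA", "2020-05-14"),
    (124, "C", "2020-10-13"), (124, "NA", "2021-01-15"), (124, "A", "2021-05-22"),
    (124, "T", "2021-08-22"),
    (456, "NA", "2022-04-19"), (456, "A", "2022-05-01"), (456, "B", "2022-07-15"),
]


def periods(rows):
    """Return (person, type, dt_start, dt_end) tuples, one per group (hypothetical helper)."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))  # PARTITION BY person ORDER BY dt_eff
    for person, person_rows in groupby(ordered, key=lambda r: r[0]):
        groups = {}      # group id -> [(type, dt_eff), ...] in date order
        na_before = 0    # step 1: NA rows strictly before the current row
        for _, typ, dt in person_rows:
            groups.setdefault(na_before, []).append((typ, dt))
            if typ == "NA":
                na_before += 1  # counted only from the next row onward
        for members in groups.values():
            first_type, dt_start = members[0]   # step 2: first row and min date
            if first_type == "NA":
                continue                        # WHERE type <> 'NA'
            has_na = any(t == "NA" for t, _ in members)
            # step 3: the max date is the closing NA; open-ended groups get None
            dt_end = members[-1][1] if has_na else None
            out.append((person, first_type, dt_start, dt_end))
    return out


for row in periods(rows):
    print(row)
```

Because each group ends at its first NA row, the last row of a closed group is always that NA, which is why the maximum date can serve as dt_end. For the sample data, periods(rows) yields the five rows of the expected output, with None standing in for NULL.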
