I'm trying to find gaps in enrollment and have a table set up like this:
ID | Enrollment_Month | Consecutive_Months |
---|---|---|
1 | 202403 | 1 |
1 | 202404 | 2 |
1 | 202405 | 3 |
1 | 202409 | 1 |
1 | 202410 | 2 |
1 | 202411 | 3 |
2 | 202401 | 1 |
2 | 202402 | 2 |
2 | 202407 | 1 |
2 | 202408 | 2 |
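Using ID 1 as an example, the goal is to find the difference in months between 202405 (the row with the highest Consecutive_Months value in its sequence) and 202409, the row where the Consecutive_Months count resets to 1. This needs to be done for each ID. Is there a way to do this?
Is there another way to calculate such gaps without using the Consecutive_Months column?
Thanks!!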
See the example below. Test data:
ID | Enrollment_Month | Consecutive_Months |
---|---|---|
1 | 202403 | 1 |
1 | 202404 | 2 |
1 | 202405 | 3 |
1 | 202407 | 1 |
1 | 202409 | 1 |
1 | 202410 | 2 |
1 | 202411 | 3 |
1 | 202503 | 1 |
1 | 202504 | 2 |
2 | 202401 | 1 |
2 | 202402 | 2 |
2 | 202407 | 1 |
2 | 202408 | 2 |
7 | 202410 | 1 |
7 | 202411 | 2 |
7 | 202412 | 3 |
7 | 202501 | 4 |
8 | 202410 | 1 |
8 | 202501 | 1 |
8 | 202502 | 2 |
9 | 202412 | 1 |
9 | 202501 | 2 |
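-- Assumes Enrollment_Month is an integer in YYYYMM form, so / is integer division (e.g. SQL Server)
with A as (
    -- pair each month with the previous enrolled month of the same ID
    select ID, Enrollment_Month currMonth
        ,lag(Enrollment_Month,1,Enrollment_Month) over(partition by ID order by Enrollment_Month) prevMonth
    from test
)
,B as (
    select *
        -- difference in calendar months between currMonth and prevMonth
        ,(currMonth/100-prevMonth/100)*12+(currMonth%100-prevMonth%100) dif
        -- running count of gaps (dif > 1); rows sharing a seqN form one unbroken run
        ,sum(case when (currMonth/100-prevMonth/100)*12+(currMonth%100-prevMonth%100)>1 then 1 else 0 end)
            over(partition by ID order by currMonth) seqN
    from A
)
-- number the months within each unbroken run
select *
    ,row_number() over(partition by ID,seqN order by currMonth) ConsecutiveMonths
from B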
ID | currMonth | prevMonth | dif | seqN | ConsecutiveMonths |
---|---|---|---|---|---|
1 | 202403 | 202403 | 0 | 0 | 1 |
1 | 202404 | 202403 | 1 | 0 | 2 |
1 | 202405 | 202404 | 1 | 0 | 3 |
1 | 202407 | 202405 | 2 | 1 | 1 |
1 | 202409 | 202407 | 2 | 2 | 1 |
1 | 202410 | 202409 | 1 | 2 | 2 |
1 | 202411 | 202410 | 1 | 2 | 3 |
1 | 202503 | 202411 | 4 | 3 | 1 |
1 | 202504 | 202503 | 1 | 3 | 2 |
2 | 202401 | 202401 | 0 | 0 | 1 |
2 | 202402 | 202401 | 1 | 0 | 2 |
2 | 202407 | 202402 | 5 | 1 | 1 |
2 | 202408 | 202407 | 1 | 1 | 2 |
7 | 202410 | 202410 | 0 | 0 | 1 |
7 | 202411 | 202410 | 1 | 0 | 2 |
7 | 202412 | 202411 | 1 | 0 | 3 |
7 | 202501 | 202412 | 1 | 0 | 4 |
8 | 202410 | 202410 | 0 | 0 | 1 |
8 | 202501 | 202410 | 3 | 1 | 1 |
8 | 202502 | 202501 | 1 | 1 | 2 |
9 | 202412 | 202412 | 0 | 0 | 1 |
9 | 202501 | 202412 | 1 | 0 | 2 |
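
To make the `dif`, `seqN` and `ConsecutiveMonths` columns easier to follow, here is a minimal Python sketch of the same logic for a single ID. The helper names `month_index` and `consecutive_counts` are made up for illustration. It assumes the months are YYYYMM integers with no duplicates, and it is only a cross-check of the idea, not a replacement for the query.

```python
def month_index(yyyymm: int) -> int:
    """Turn a YYYYMM integer into a running month number (year * 12 + month)."""
    return (yyyymm // 100) * 12 + yyyymm % 100


def consecutive_counts(months):
    """Return (month, dif, run_number, consecutive_count) for one ID.

    dif is the month difference to the previous enrolled month (0 for the
    first row), run_number plays the role of seqN, and the count restarts
    at 1 whenever dif > 1.
    """
    result = []
    prev = None
    run_number = 0
    count = 0
    for m in sorted(months):
        dif = 0 if prev is None else month_index(m) - month_index(prev)
        if dif > 1:
            # a gap: start a new run and restart the counter
            run_number += 1
            count = 1
        else:
            count += 1
        result.append((m, dif, run_number, count))
        prev = m
    return result


id1 = [202403, 202404, 202405, 202407, 202409, 202410, 202411, 202503, 202504]
for row in consecutive_counts(id1):
    print(row)
```

The difference of two month indexes, `(y1*12 + m1) - (y0*12 + m0)`, is the same quantity as `(currMonth/100-prevMonth/100)*12+(currMonth%100-prevMonth%100)` in the query, which is why year boundaries such as 202412 to 202501 still give a difference of 1.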
-- The same query written with nested derived tables instead of CTEs
select *
    ,row_number() over(partition by ID,seqN order by currMonth) ConsecutiveMonths
from (
    select *
        -- difference in calendar months between currMonth and prevMonth
        ,(currMonth/100-prevMonth/100)*12+(currMonth%100-prevMonth%100) dif
        -- running count of gaps (dif > 1) per ID
        ,sum(case when (currMonth/100-prevMonth/100)*12+(currMonth%100-prevMonth%100)>1 then 1 else 0 end)
            over(partition by ID order by currMonth) seqN
    from (
        select ID, Enrollment_Month currMonth
            ,lag(Enrollment_Month,1,Enrollment_Month) over(partition by ID order by Enrollment_Month) prevMonth
        from test
    )A
)B
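
If only the gaps themselves are needed (for example 202405 to 202407 for ID 1), one option is to keep just the rows where the month difference is greater than 1. The sketch below runs a trimmed variant of the query against the test data in an in-memory SQLite database from Python. That setup is an assumption made for the sake of a runnable example: it needs SQLite 3.25 or later for window functions, and it uses `lag` without a default so that the first row of each ID has a NULL `prevMonth` and drops out of the filter.

```python
import sqlite3

# Test data from the answer: (ID, Enrollment_Month as a YYYYMM integer)
rows = [
    (1, 202403), (1, 202404), (1, 202405), (1, 202407), (1, 202409),
    (1, 202410), (1, 202411), (1, 202503), (1, 202504),
    (2, 202401), (2, 202402), (2, 202407), (2, 202408),
    (7, 202410), (7, 202411), (7, 202412), (7, 202501),
    (8, 202410), (8, 202501), (8, 202502),
    (9, 202412), (9, 202501),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table test (ID integer, Enrollment_Month integer)")
conn.executemany("insert into test values (?, ?)", rows)

gap_query = """
with A as (
    select ID, Enrollment_Month as currMonth,
           lag(Enrollment_Month) over (partition by ID order by Enrollment_Month) as prevMonth
    from test
)
select ID, prevMonth, currMonth,
       (currMonth/100 - prevMonth/100)*12 + (currMonth%100 - prevMonth%100) as dif
from A
-- keep only rows that start after a gap; NULL prevMonth rows are filtered out
where (currMonth/100 - prevMonth/100)*12 + (currMonth%100 - prevMonth%100) > 1
order by ID, currMonth
"""

# Each tuple: (ID, last month before the gap, first month after the gap, dif)
gaps = conn.execute(gap_query).fetchall()
for gap in gaps:
    print(gap)
```

Each returned row pairs the last enrolled month before a gap with the first enrolled month after it, so the Consecutive_Months column is not needed at all.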