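I need to show the previous year's and the current year's values in a single row for each combination of columns. Here is the scenario. I have a dataset like this: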
Student City Country Year Month Subject Marks
John Boston USA 2018 01 Maths 90
Mark London UK 2018 01 Maths 95
John Boston USA 2019 01 Maths 95
Mark London UK 2019 01 Maths 83
John Boston USA 2018 01 Arts 90
Mark London UK 2018 01 Arts 95
John Boston USA 2019 01 Arts 95
Mark London UK 2019 01 Arts 83
I want the output to be:
Student City Country Year Month Maths_curr Maths_prev Arts_curr Arts_prev
John Boston USA 2019 01 95 90 95 90
John Boston USA 2018 01 90 null 90 null
Mark London UK 2019 01 83 95 83 95
Mark London UK 2018 01 95 null 95 null
I think I need the LAG function to achieve this. I used this code:
select student,city,country,year,month,subject,marks as curr,
lag(marks,1)over(partition by student,city,country,subject order by year,month) as prev
from <table>
order by student,city,country,year,month
The output I got is:
Student City Country Year Month Subject Curr Prev
John Boston USA 2019 01 Maths 95 90
John Boston USA 2018 01 Maths 90 null
John Boston USA 2019 01 Arts 95 90
John Boston USA 2018 01 Arts 90 null
Mark London UK 2019 01 Maths 83 95
Mark London UK 2018 01 Maths 95 null
Mark London UK 2019 01 Arts 83 95
Mark London UK 2018 01 Arts 95 null
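Can you help me get the desired output? Is LEAD or LAG the right function to use in this case? Is there another way to achieve this in Redshift?
Any help is greatly appreciated.
I also tried this code: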
select student,city,country,year,month,subject,
case when substring(curr,1,1) = 'M' then cast(split_part(curr,' ',2) as integer) end as maths_curr,
case when substring(prev,1,1) = 'M' then cast(split_part(prev,' ',2) as integer) end as maths_prev,
case when substring(curr,1,1) = 'A' then cast(split_part(curr,' ',2) as integer) end as arts_curr,
case when substring(prev,1,1) = 'A' then cast(split_part(prev,' ',2) as integer) end as arts_prev
from
(select student,city,country,year,month,subject,
case when subject = 'MATHS' then 'M ' + cast(nvl(marks,0) as varchar)
else 'A ' + cast(nvl(marks,0) as varchar)
end as curr,
case when subject = 'MATHS' then 'M ' + cast(nvl(lag(marks,1)over (partition by student,city,country,subject order by year,month),0) as varchar)
else 'A ' + cast(nvl(lag(marks,1)over (partition by student,city,country,subject order by year,month),0) as varchar)
end as prev
from <table>
order by student,city,country,year,month)
Here the output I get is:
Student City Country Year Month Subject Maths_Curr Maths_Prev Arts_Curr Arts_Prev
John Boston USA 2019 01 Maths 95 90 null null
John Boston USA 2018 01 Maths 90 null null null
John Boston USA 2019 01 Arts null null 95 90
John Boston USA 2018 01 Arts null null 90 null
Mark London UK 2019 01 Maths 83 95 null null
Mark London UK 2018 01 Maths 95 null null null
Mark London UK 2019 01 Arts null null 83 95
Mark London UK 2018 01 Arts null null 95 null
I'm not sure where exactly I went wrong. I need some guidance here.
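This should do the trick. LAG is the right function here; what your attempts are missing is the final step of dropping Subject from the select list and aggregating the per-subject columns with MAX, so that each student/year/month collapses into a single row: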
WITH base AS (
SELECT *,
CASE WHEN "Subject" = 'Maths' THEN "Marks" ELSE NULL END AS maths_current,
CASE WHEN "Subject" = 'Arts' THEN "Marks" ELSE NULL END AS arts_current,
CASE WHEN "Subject" = 'Maths' THEN LAG("Marks") OVER (PARTITION BY "Student","City","Country","Subject" ORDER BY "Year","Month") ELSE NULL END AS previous_math,
CASE WHEN "Subject" = 'Arts' THEN LAG("Marks") OVER (PARTITION BY "Student","City","Country","Subject" ORDER BY "Year","Month") ELSE NULL END AS previous_arts
FROM <table>
)
SELECT "Student",
"City",
"Country",
"Year",
"Month",
MAX(maths_current) AS Maths_curr,
MAX(previous_math) AS Maths_prev,
MAX(arts_current) AS Arts_curr,
MAX(previous_arts) AS Arts_prev
FROM base
GROUP BY 1,2,3,4,5
ORDER BY 1,2,3,4 DESC,5 DESC
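To check the approach against the sample data from the question, here is a minimal sketch that runs the same LAG-then-aggregate pattern in an in-memory SQLite database. SQLite is only a stand-in for Redshift here, and the table name `scores`, the lowercase column names, and the helper `pivot_with_previous` are assumptions made for illustration, not part of the original query.

```python
import sqlite3

# Sample rows copied from the question (month kept as text to preserve "01").
rows = [
    ("John", "Boston", "USA", 2018, "01", "Maths", 90),
    ("Mark", "London", "UK", 2018, "01", "Maths", 95),
    ("John", "Boston", "USA", 2019, "01", "Maths", 95),
    ("Mark", "London", "UK", 2019, "01", "Maths", 83),
    ("John", "Boston", "USA", 2018, "01", "Arts", 90),
    ("Mark", "London", "UK", 2018, "01", "Arts", 95),
    ("John", "Boston", "USA", 2019, "01", "Arts", 95),
    ("Mark", "London", "UK", 2019, "01", "Arts", 83),
]

# Step 1 (CTE): compute the previous mark per student and subject with LAG.
# Step 2: pivot subjects into columns and collapse to one row per
# student/year/month with MAX, which ignores the NULLs from the other subject.
QUERY = """
WITH base AS (
    SELECT student, city, country, year, month, subject, marks,
           LAG(marks) OVER (PARTITION BY student, city, country, subject
                            ORDER BY year, month) AS prev_marks
    FROM scores
)
SELECT student, city, country, year, month,
       MAX(CASE WHEN subject = 'Maths' THEN marks END)      AS maths_curr,
       MAX(CASE WHEN subject = 'Maths' THEN prev_marks END) AS maths_prev,
       MAX(CASE WHEN subject = 'Arts'  THEN marks END)      AS arts_curr,
       MAX(CASE WHEN subject = 'Arts'  THEN prev_marks END) AS arts_prev
FROM base
GROUP BY student, city, country, year, month
ORDER BY student, city, country, year DESC, month DESC
"""


def pivot_with_previous(data):
    """Load the rows into a hypothetical `scores` table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scores (student TEXT, city TEXT, country TEXT, "
        "year INTEGER, month TEXT, subject TEXT, marks INTEGER)"
    )
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?, ?, ?, ?, ?)", data)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = pivot_with_previous(rows)
for row in result:
    print(row)
```

Note that LAG returns the previous row within the partition, not necessarily the previous calendar year. If a student has no row for some year, the "prev" column will silently carry a value from an older year; in that case a self-join on year = year - 1 may be a better fit.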