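I'm using Snowflake.
I have a table of job assignments for a company's employees. An employee can hold several active jobs at the same time (an active job has a null end_date). I want to consolidate this into one row per employee:
EMPLOYEE_ID | START_DATE | END_DATE | JOB_TITLE |
---|---|---|---|
1442 | 7/30/24 | | Tutor |
1442 | 7/30/24 | | Tutor |
1442 | 6/28/24 | | Instructional Specialist |
1442 | 5/1/24 | 6/27/24 | Instructional Specialist |
1442 | 12/16/21 | 7/29/24 | Tutor |
1442 | 12/16/21 | | Lead Instructor |
1442 | 12/16/21 | 7/29/24 | Tutor |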
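If an employee has any null values in the end_date field, I only want to retrieve the distinct job titles of those active rows (collapsing the first 2 rows into 1, since both are the same job with the same start date) into columns 1-5, in descending order of start_date, like this:
EMPLOYEE_ID | JOB_TITLE_1 | JOB_TITLE_2 | JOB_TITLE_3 | JOB_TITLE_4 | JOB_TITLE_5 |
---|---|---|---|---|---|
1442 | Tutor | Instructional Specialist | Lead Instructor | | |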
Now suppose the employee currently has no active jobs, so the table looks like this:
EMPLOYEE_ID | START_DATE | END_DATE | JOB_TITLE |
---|---|---|---|
1442 | 5/1/24 | 6/27/24 | Instructional Specialist |
1442 | 12/16/21 | 7/29/24 | Tutor |
1442 | 12/16/21 | 7/29/24 | Tutor |
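In that case, I would like the result to look like this:
EMPLOYEE_ID | JOB_TITLE_1 | JOB_TITLE_2 | JOB_TITLE_3 | JOB_TITLE_4 | JOB_TITLE_5 |
---|---|---|---|---|---|
1442 | Instructional Specialist | Tutor | | | |
Here is the query I'm using. It works, but it does not sort the Job_Title_1 through Job_Title_5 columns in descending start_date order: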
WITH job_position_info_ADP AS (
    SELECT
        'ADP' AS source,
        CAST(w.associate_oid AS STRING) AS worker_id,
        CAST(w.id AS STRING) AS Employee_ID,
        TO_CHAR(wah._fivetran_start, 'MM/DD/YY') AS start_date,
        CASE
            WHEN wah._fivetran_active = TRUE THEN NULL
            ELSE TO_CHAR(wah._fivetran_end, 'MM/DD/YY')
        END AS end_date,
        wah.job_title AS Job_Title,
        ROW_NUMBER() OVER (PARTITION BY CAST(w.id AS STRING) ORDER BY wah._fivetran_start DESC) AS rn
    FROM
        prod_raw.adp_workforce_now.worker w
    JOIN
        prod_raw.adp_workforce_now.worker_report_to AS wr
        ON w.id = wr.worker_id
    JOIN
        prod_raw.adp_workforce_now.work_assignment_history AS wah
        ON w.id = wah.worker_id
),
recent_jobs_with_null_end AS (
    SELECT
        Employee_ID,
        Job_Title,
        ROW_NUMBER() OVER (PARTITION BY Employee_ID ORDER BY start_date DESC) AS rn
    FROM
        job_position_info_ADP
    WHERE
        end_date IS NULL
),
recent_jobs_all AS (
    SELECT
        Employee_ID,
        Job_Title,
        ROW_NUMBER() OVER (PARTITION BY Employee_ID ORDER BY start_date DESC) AS rn
    FROM
        job_position_info_ADP
)
SELECT
    Employee_ID,
    MAX(CASE WHEN rn = 1 THEN Job_Title END) AS Job_Title_1,
    MAX(CASE WHEN rn = 2 THEN Job_Title END) AS Job_Title_2,
    MAX(CASE WHEN rn = 3 THEN Job_Title END) AS Job_Title_3,
    MAX(CASE WHEN rn = 4 THEN Job_Title END) AS Job_Title_4,
    MAX(CASE WHEN rn = 5 THEN Job_Title END) AS Job_Title_5
FROM (
    SELECT * FROM recent_jobs_with_null_end
    UNION ALL
    SELECT * FROM recent_jobs_all
    WHERE Employee_ID NOT IN (SELECT Employee_ID FROM recent_jobs_with_null_end)
) AS combined
WHERE
    Employee_ID = '1442'
GROUP BY
    Employee_ID;
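Here is reproducible code in SQL Server, including the schema and data.
The idea is to reorganize the work_assignment_history table so that jobs are ordered by date, leaving out current positions (those where END_DATE is null).
Then, in job_unique, duplicates are removed and the row number is recalculated. I found that the initial row numbers no longer line up after deduplication, and that converting the dates to character strings makes it hard to compute the new row numbers.
Finally, a PIVOT operation turns the rows into columns.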
;WITH job_position_info_ADP AS (
    SELECT
        CAST(wah.EMPLOYEE_ID AS varchar(10)) AS Employee_ID,
        wah.START_DATE AS start_date,
        wah.job_title AS Job_Title,
        ROW_NUMBER() OVER (PARTITION BY wah.EMPLOYEE_ID ORDER BY wah.START_DATE DESC) AS rn
    FROM
        [development].[dbo].work_assignment_history AS wah
    WHERE
        wah.END_DATE is not null
),
job_unique as
(
    select Employee_ID, start_date, Job_Title, max(rn) as rn,
        ROW_NUMBER() OVER (PARTITION BY EMPLOYEE_ID ORDER BY start_date DESC) AS new_rn
    from job_position_info_ADP
    group by Employee_ID, start_date, Job_Title
)
SELECT Employee_ID, [1], [2], [3], [4], [5]
FROM
(
    SELECT Employee_ID, job_title, new_rn
    FROM job_unique
) AS SourceTable
PIVOT
(
    MAX(job_title)
    FOR new_rn IN ([1], [2], [3], [4], [5])
) AS PivotTable;
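Output
Employee_ID | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
1442 | Especialista en Instrucción | Tutor | NULL | NULL | NULL |
I hope this is easy to translate to Snowflake, since most SQL dialects offer a PIVOT operation, with some differences in syntax.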
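For the Snowflake side, here is a minimal sketch of how the same idea might look, using conditional aggregation (as in the question's query) rather than PIVOT. It is not a drop-in translation. The inline rows copy the sample data from this answer with the dates rewritten in ISO form; the CTE names flagged, titles and ranked and the columns has_active and latest_start are made up for illustration; and it assumes "distinct" means one column per job title, positioned by that title's most recent start date. Unlike the SQL Server query above, it tries to cover both cases from the question: only active rows when any exist, otherwise all rows.

```sql
-- Sketch only: an inline CTE stands in for the real history table.
WITH work_assignment_history (employee_id, start_date, end_date, job_title) AS (
    SELECT column1, column2::DATE, column3::DATE, column4
    FROM VALUES
        (1442, '2024-07-30', NULL,         'Tutor'),
        (1442, '2024-07-30', NULL,         'Tutor'),
        (1442, '2024-06-28', NULL,         'Especialista en Instrucción'),
        (1442, '2024-05-01', '2024-06-27', 'Especialista en Instrucción'),
        (1442, '2021-12-16', '2024-07-29', 'Tutor'),
        (1442, '2021-12-16', NULL,         'Instructor Principal'),
        (1442, '2021-12-16', '2024-07-29', 'Tutor')
),
flagged AS (
    -- has_active = 1 when the employee has at least one row with a NULL end_date.
    SELECT
        employee_id,
        start_date,
        end_date,
        job_title,
        MAX(CASE WHEN end_date IS NULL THEN 1 ELSE 0 END)
            OVER (PARTITION BY employee_id) AS has_active
    FROM work_assignment_history
),
titles AS (
    -- Keep only active rows if any exist, otherwise keep every row,
    -- then collapse duplicates to one row per title.
    SELECT
        employee_id,
        job_title,
        MAX(start_date) AS latest_start
    FROM flagged
    WHERE has_active = 0 OR end_date IS NULL
    GROUP BY employee_id, job_title
),
ranked AS (
    -- Number the titles on the real DATE value, newest first;
    -- job_title is only a tie-breaker to keep the numbering deterministic.
    SELECT
        employee_id,
        job_title,
        ROW_NUMBER() OVER (
            PARTITION BY employee_id
            ORDER BY latest_start DESC, job_title
        ) AS rn
    FROM titles
)
SELECT
    employee_id,
    MAX(CASE WHEN rn = 1 THEN job_title END) AS job_title_1,
    MAX(CASE WHEN rn = 2 THEN job_title END) AS job_title_2,
    MAX(CASE WHEN rn = 3 THEN job_title END) AS job_title_3,
    MAX(CASE WHEN rn = 4 THEN job_title END) AS job_title_4,
    MAX(CASE WHEN rn = 5 THEN job_title END) AS job_title_5
FROM ranked
GROUP BY employee_id;
-- For the sample rows: 1442 | Tutor | Especialista en Instrucción | Instructor Principal | NULL | NULL
```

If the four rows with a NULL end date are removed from the inline data, has_active becomes 0 and the same query should return Especialista en Instrucción followed by Tutor. In the question's own query, the equivalent change would probably be to carry wah._fivetran_start through the CTEs and number the rows on that timestamp instead of on the TO_CHAR text.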
Schema
CREATE TABLE [dbo].[work_assignment_history](
[EMPLOYEE_ID] [smallint] NOT NULL,
[START_DATE] [date] NOT NULL,
[END_DATE] [date] NULL,
[JOB_TITLE] [nvarchar](50) NOT NULL
) ON [PRIMARY]
GO
Data
EMPLOYEE_ID;START_DATE;END_DATE;JOB_TITLE
1442;7/30/24;;Tutor
1442;7/30/24;;Tutor
1442;6/28/24;;Especialista en Instrucción
1442;5/1/24;6/27/24;Especialista en Instrucción
1442;12/16/21;7/29/24;Tutor
1442;12/16/21;;Instructor Principal
1442;12/16/21;7/29/24;Tutor
Note: for some reason, when I imported the HTML data into SQL Server, it was translated into Spanish. Sorry.
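On the remark that converting dates to character strings makes the row numbers hard to compute: a likely cause of the ordering problem in the question (an assumption, since the real data is not shown) is that start_date there comes from TO_CHAR(..., 'MM/DD/YY'), so ORDER BY start_date DESC compares text rather than dates. The sketch below, written for Snowflake with a made-up sample_dates CTE, numbers the same four dates both ways.

```sql
-- Sketch only: compares date ordering with text ordering on four sample dates.
WITH sample_dates AS (
    SELECT column1::DATE AS start_dt
    FROM VALUES ('2024-07-30'), ('2024-06-28'), ('2024-05-01'), ('2021-12-16')
)
SELECT
    start_dt,
    TO_CHAR(start_dt, 'MM/DD/YY') AS start_txt,
    -- Ordering on the DATE value: newest date gets 1.
    ROW_NUMBER() OVER (ORDER BY start_dt DESC) AS rn_by_date,
    -- Ordering on the formatted text: compared character by character.
    ROW_NUMBER() OVER (ORDER BY TO_CHAR(start_dt, 'MM/DD/YY') DESC) AS rn_by_text
FROM sample_dates
ORDER BY start_dt DESC;
```

Because the text '12/16/21' starts with '1' and the 2024 values start with '0', the December 2021 row gets rn_by_text = 1 even though it is the oldest date, while rn_by_date puts it last. Keeping the column as a DATE (or TIMESTAMP) for ordering and formatting it only in the final SELECT avoids this.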