My dataset looks like this:
TEAM_ID  PLAYER_ID  NUM_POINTS
21       39         20
21       50         10
21       67         10
22       74         0
22       73         0
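I want to get the subset of the data in which every team has a clear "winner": when I group by team ID, exactly one player has more points than every other player on the team. If two or more players on a team are tied for the top score, I don't want that team included in the subset. I tried a query, but the number of rows I get in the subset seems too high, so I think I made a mistake. Here is my query: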
WITH ADD_MAX_POINTS_VALUES_TO_TEAM AS (
    SELECT
        T1.TEAM_ID,
        MAX(T1.NUM_POINTS) AS MAX_POINTS_FOR_TEAM
    FROM MY_TABLE T1
    GROUP BY T1.TEAM_ID
), GET_SUBSET AS (
    SELECT T1.TEAM_ID,
           T1.PLAYER_ID,
           T2.MAX_POINTS_FOR_TEAM
    FROM MY_TABLE T1 INNER JOIN ADD_MAX_POINTS_VALUES_TO_TEAM T2
        ON T1.TEAM_ID = T2.TEAM_ID
    WHERE T1.NUM_POINTS = T2.MAX_POINTS_FOR_TEAM
    GROUP BY 1, 2, 3
    HAVING COUNT(*) = 1 -- > HERE I AM TRYING TO SAY THERE IS ONE UNIQUE PLAYER ON THE TEAM WITH THE MAX SCORE
)
SELECT COUNT(*) FROM GET_SUBSET
Any help is appreciated. Let me know if I need to provide more information.
Thanks!!
We can use the RANK() window function here:
WITH cte1 AS (
    -- Rank players within each team by points; tied top scorers all get rank 1.
    SELECT t.*, RANK() OVER (PARTITION BY TEAM_ID ORDER BY NUM_POINTS DESC) rnk
    FROM MY_TABLE t
),
cte2 AS (
    -- Keep teams whose rank-1 player has no other rank-1 teammate.
    SELECT TEAM_ID
    FROM cte1 t1
    WHERE rnk = 1 AND
          NOT EXISTS (
              SELECT 1
              FROM cte1 t2
              WHERE t2.TEAM_ID = t1.TEAM_ID AND
                    t2.PLAYER_ID <> t1.PLAYER_ID AND
                    t2.rnk = 1
          )
)
SELECT t1.TEAM_ID, t1.PLAYER_ID, t1.NUM_POINTS
FROM MY_TABLE t1
INNER JOIN cte2 t2
    ON t2.TEAM_ID = t1.TEAM_ID;
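The first CTE ranks the players within each team by points. The second CTE then identifies the teams that have only a single rank 1 record. The final query returns the rows of MY_TABLE that belong to those teams.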
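To check this against the sample data from the question, here is a minimal sketch that loads the five rows into an in-memory SQLite database and runs two queries. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions), and it assumes PLAYER_ID is unique within a team, so the NOT EXISTS test can skip the row itself. The names ROWS, ORIGINAL_GROUPING, RANKED_QUERY and run_demo are made up for this illustration. The first query reproduces the grouping from the question with the ordinals written out as column names. Because PLAYER_ID is part of the GROUP BY, every top-scoring player is a group of one, so the tied players on team 22 are counted as well.

```python
import sqlite3

# Sample rows copied from the question: (TEAM_ID, PLAYER_ID, NUM_POINTS).
ROWS = [(21, 39, 20), (21, 50, 10), (21, 67, 10), (22, 74, 0), (22, 73, 0)]

# Grouping as in the question: PLAYER_ID is in the GROUP BY, so each
# top-scoring player forms its own group and COUNT(*) = 1 is always true.
ORIGINAL_GROUPING = """
SELECT COUNT(*) FROM (
    SELECT T1.TEAM_ID, T1.PLAYER_ID, T2.MAX_POINTS_FOR_TEAM
    FROM MY_TABLE T1
    INNER JOIN (
        SELECT TEAM_ID, MAX(NUM_POINTS) AS MAX_POINTS_FOR_TEAM
        FROM MY_TABLE
        GROUP BY TEAM_ID
    ) T2 ON T1.TEAM_ID = T2.TEAM_ID
    WHERE T1.NUM_POINTS = T2.MAX_POINTS_FOR_TEAM
    GROUP BY T1.TEAM_ID, T1.PLAYER_ID, T2.MAX_POINTS_FOR_TEAM
    HAVING COUNT(*) = 1
)
"""

# RANK()-based query: a team qualifies only if its rank-1 player has no
# other rank-1 teammate (assumes PLAYER_ID is unique within a team).
RANKED_QUERY = """
WITH cte1 AS (
    SELECT t.*, RANK() OVER (PARTITION BY TEAM_ID ORDER BY NUM_POINTS DESC) AS rnk
    FROM MY_TABLE t
),
cte2 AS (
    SELECT TEAM_ID
    FROM cte1 t1
    WHERE rnk = 1 AND
          NOT EXISTS (
              SELECT 1
              FROM cte1 t2
              WHERE t2.TEAM_ID = t1.TEAM_ID AND
                    t2.PLAYER_ID <> t1.PLAYER_ID AND
                    t2.rnk = 1
          )
)
SELECT t1.TEAM_ID, t1.PLAYER_ID, t1.NUM_POINTS
FROM MY_TABLE t1
INNER JOIN cte2 t2 ON t2.TEAM_ID = t1.TEAM_ID
ORDER BY t1.TEAM_ID, t1.PLAYER_ID
"""


def run_demo():
    """Load the sample rows and return (count from the question's grouping, ranked rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MY_TABLE (TEAM_ID INTEGER, PLAYER_ID INTEGER, NUM_POINTS INTEGER)"
    )
    conn.executemany("INSERT INTO MY_TABLE VALUES (?, ?, ?)", ROWS)
    original_count = conn.execute(ORIGINAL_GROUPING).fetchone()[0]
    winners = conn.execute(RANKED_QUERY).fetchall()
    conn.close()
    return original_count, winners


original_count, winners = run_demo()
print(original_count)  # 3: player 39 plus both tied players on team 22
print(winners)         # only the three rows of team 21
```

On this data the question's grouping counts 3 rows, while the ranked query keeps only team 21, whose player 39 is the sole top scorer. Other databases may differ in syntax details, so treat this as a check of the logic rather than of any particular dialect.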