Calculating cumulative percentages in SQL


I have this table (myt):

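-- food_year mirrors d1, matching the table printed after this script;
-- the queries in this question reference food_year.
CREATE TABLE myt (
  name VARCHAR(50),
  food VARCHAR(50),
  d1 INT,
  food_year INT
);

INSERT INTO myt (name, food, d1, food_year) VALUES
('john', 'pizza', 2010, 2010),
('john', 'pizza', 2011, 2011),
('john', 'cake', 2012, 2012),
('tim', 'apples', 2015, 2015),
('david', 'apples', 2020, 2020),
('david', 'apples', 2021, 2021),
('alex', 'cookies', 2005, 2005),
('alex', 'cookies', 2006, 2006);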

  name    food   d1 food_year
  john   pizza 2010      2010
  john   pizza 2011      2011
  john    cake 2012      2012
   tim  apples 2015      2015
 david  apples 2020      2020
 david  apples 2021      2021
  alex cookies 2005      2005
  alex cookies 2006      2006

I wrote the following query to find the percentage breakdown of each food within each name:

WITH FoodCounts AS (
    SELECT name, 
           food, 
           COUNT(*) as food_count
    FROM myt
    GROUP BY name, food
),
TotalCounts AS (
    SELECT name, 
           COUNT(*) as total_count
    FROM myt
    GROUP BY name
)
SELECT fc.name, 
       fc.food, 
       (fc.food_count * 100.0) / tc.total_count as percentage
FROM FoodCounts fc
JOIN TotalCounts tc
ON fc.name = tc.name;


  name    food percentage
  alex cookies  100.00000
 david  apples  100.00000
  john    cake   33.33333
  john   pizza   66.66667
   tim  apples  100.00000

I am now trying to modify this query to find cumulative percentages. For example, what is John's food breakdown as of 2011? What is it as of 2012?

I tried writing a series of CTEs with a window function to answer this:

WITH YearlyFoodCounts AS (
    SELECT name, 
           food, 
           food_year,
           COUNT(*) as food_count
    FROM myt
    GROUP BY name, food, food_year
),
CumulativeCounts AS (
    SELECT name, 
           food_year,
           SUM(food_count) OVER (PARTITION BY name ORDER BY food_year) as cumulative_count
    FROM YearlyFoodCounts
)
SELECT yfc.name, 
       yfc.food, 
       yfc.food_year,
       yfc.food_count,
       cc.cumulative_count,
       (yfc.food_count * 100.0) / cc.cumulative_count as percentage
FROM YearlyFoodCounts yfc
JOIN CumulativeCounts cc
ON yfc.name = cc.name AND yfc.food_year = cc.food_year
ORDER BY yfc.name, yfc.food_year;

The result appears to be in the right format:

  name    food food_year food_count cumulative_count percentage
  alex cookies      2005          1                1  100.00000
  alex cookies      2006          1                2   50.00000
 david  apples      2020          1                1  100.00000
 david  apples      2021          1                2   50.00000
  john   pizza      2010          1                1  100.00000
  john   pizza      2011          1                2   50.00000
  john    cake      2012          1                3   33.33333
   tim  apples      2015          1                1  100.00000

Is this the correct way to approach the problem?

sql db2
1 Answer

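You are overcomplicating this. It doesn't need joins or subqueries; you can do it in a single level using window functions. You can put normal aggregates inside window functions, because window functions are evaluated after normal aggregation.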
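To see an aggregate nested inside a window function on a small case, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module rather than Db2 (the question is tagged db2) and uses only John's three rows from the question. The GROUP BY produces one row per name and food, and the window SUM then totals those per-group counts across each name, which is the job the TotalCounts CTE and the join did in the question's first query.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Db2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myt (name TEXT, food TEXT, food_year INT)")
conn.executemany(
    "INSERT INTO myt VALUES (?, ?, ?)",
    [("john", "pizza", 2010), ("john", "pizza", 2011), ("john", "cake", 2012)],
)

# GROUP BY runs first and yields COUNT(*) per (name, food).
# SUM(COUNT(*)) OVER (...) then adds those counts up within each name.
rows = conn.execute(
    """
    SELECT name,
           food,
           COUNT(*) AS food_count,
           SUM(COUNT(*)) OVER (PARTITION BY name) AS total_count
    FROM myt
    GROUP BY name, food
    ORDER BY name, food
    """
).fetchall()

for row in rows:
    print(row)
# ('john', 'cake', 1, 3)
# ('john', 'pizza', 2, 3)
```

Dividing food_count * 100.0 by total_count in the same SELECT would give the 33.3 / 66.7 split shown in the question without any join.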

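Notes:

  • Use ROWS UNBOUNDED PRECEDING, because the default is RANGE UNBOUNDED PRECEDING, which is subtly different: RANGE also includes every row that ties with the current row on the ORDER BY value.
  • You say there can only be one food per pair of name, food_year. So you should aggregate by those two columns only.
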
SELECT
    name, 
    MIN(food) AS food, 
    food_year,
    COUNT(*) as food_count,
    SUM(COUNT(*)) OVER (PARTITION BY name ORDER BY food_year ROWS UNBOUNDED PRECEDING) as cumulative_count,
    COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY name ORDER BY food_year ROWS UNBOUNDED PRECEDING) as percentage
FROM myt
GROUP BY
    name,
    food_year;
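The difference between the two frames mentioned in the first note only shows up when rows tie on the ORDER BY value. The sketch below is an illustration under assumptions: it uses SQLite through Python's sqlite3 module rather than Db2, and a hypothetical table t in which two rows share the year 2011.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Db2.
conn = sqlite3.connect(":memory:")
# Hypothetical table: ids 2 and 3 share food_year 2011, so they tie in the ORDER BY.
conn.execute("CREATE TABLE t (id INT, food_year INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 2010), (2, 2011), (3, 2011), (4, 2012)],
)

rows = conn.execute(
    """
    SELECT id,
           food_year,
           -- No frame clause: the frame defaults to RANGE, which takes in tied rows.
           COUNT(*) OVER (ORDER BY food_year) AS range_count,
           -- ROWS stops at the current row, even if later rows tie with it.
           COUNT(*) OVER (ORDER BY food_year ROWS UNBOUNDED PRECEDING) AS rows_count
    FROM t
    ORDER BY food_year, id
    """
).fetchall()

for row in rows:
    print(row)
```

Both 2011 rows report 3 under the default frame, while ROWS gives one of them 2 and the other 3 (which tied row gets which value is not defined). In the query above, GROUP BY leaves one row per name and food_year, so no ties arise and the two frames agree; spelling out ROWS simply keeps the intent explicit.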

db<>fiddle
