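I am trying to get an `Overall Average` figure like the one below on every row. What is the best way to achieve this in Snowflake SQL?
`Overall Average` should be (300+150+100)/3, i.e. the average of the three per-ID averages.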
ID | Desc | SUB_ID | Value | Average by ID | Overall Average |
---|---|---|---|---|---|
1 | ABC | 10 | 300 | 300 | 183.33 |
1 | ABC | 20 | 150 | 300 | 183.33 |
1 | ABC | 30 | 450 | 300 | 183.33 |
2 | DEF | 10 | 150 | 150 | 183.33 |
2 | DEF | 20 | 150 | 150 | 183.33 |
3 | EFG | 30 | 100 | 100 | 183.33 |
3 | EFG | 10 | 180 | 100 | 183.33 |
3 | EFG | 20 | 20 | 100 | 183.33 |
I was able to get `Average by ID` using a window function: `AVG(Value) OVER (PARTITION BY ID)`.
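Change the `PARTITION BY` to a constant, e.g. `null` or `true`:

    select
        $1 AS id,
        $2 AS value,
        -- one window per id
        avg(value) over (partition by id) as avg_by_id,
        -- a constant partition key puts every row in a single window,
        -- so this is the plain average over all rows
        avg(value) over (partition by true) as overall_avg
    from values
        (1,300),
        (1,150),
        (1,450),
        (2,150),
        (2,150),
        (3,100),
        (3,180),
        (3,20);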
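What you are asking for is an average of averages, which is statistically considered a "garbage value": it can be calculated, but it is usually meaningless, so it should generally be avoided.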
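To make the difference concrete, here is the arithmetic for the sample data in the question. This is only a worked check of the two definitions, not something taken from the answers themselves:

```latex
\[
\text{average over all rows}
  = \frac{300+150+450+150+150+100+180+20}{8}
  = \frac{1500}{8} = 187.5
\]
\[
\text{average of per-ID averages}
  = \frac{300+150+100}{3}
  = \frac{550}{3} \approx 183.33
\]
```

The two results differ because ID 2 contributes only two rows while IDs 1 and 3 contribute three each, so a plain average over all rows weights the IDs unequally.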
One way to compute the average of averages is a cross join against an aggregated subquery:

    with example_data(id, value) as (
        select * from values
            (1,300),
            (1,150),
            (1,450),
            (2,150),
            (2,150),
            (3,100),
            (3,180),
            (3,20)
    )
    select a.*
        ,b.*
    from (
        select
            id,
            value,
            avg(value) over (partition by id) as avg_by_id,
            -- average over all rows
            avg(value) over (partition by true) as overall_avg_correct
        from example_data
    ) as a
    cross join (
        -- single-row result: the average of the per-id averages
        select avg(id_avg) as avg_avg
        from (
            select avg(value) as id_avg from example_data group by id
        )
    ) as b;
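Or it can be done as a sub-select in the SELECT list:

    -- uses the example_data CTE from the previous query
    select
        id,
        value,
        avg(value) over (partition by id) as avg_by_id,
        avg(value) over (partition by true) as overall_avg_correct,
        (
            -- uncorrelated scalar subquery: the average of the per-id averages
            select avg(id_avg)
            from (
                select avg(value) as id_avg from example_data group by id
            )
        ) as avg_avg
    from example_data;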
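As a further alternative, the same average of averages can be obtained with window functions alone. The following is a minimal sketch, assuming the same sample data. The names `rn` and `t`, and the use of `row_number()` with `iff()` to count each ID exactly once, are illustrative choices and are not taken from the queries above.

```sql
with example_data(id, value) as (
    select * from values
        (1,300), (1,150), (1,450),
        (2,150), (2,150),
        (3,100), (3,180), (3,20)
)
select
    id,
    value,
    avg_by_id,
    -- keep avg_by_id only on the first row of each id (null elsewhere);
    -- avg() skips nulls, so every id contributes exactly once
    avg(iff(rn = 1, avg_by_id, null)) over () as avg_avg
from (
    select
        id,
        value,
        avg(value) over (partition by id) as avg_by_id,
        row_number() over (partition by id order by value) as rn
    from example_data
) as t
order by id, value;
```

For the sample data, `avg_avg` should come out as (300+150+100)/3, about 183.33, on every row. Whether this reads better than the cross join or the scalar subquery is largely a matter of taste.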
In SQL, this can be done with window functions and the `OVER` clause:

    SELECT *
        , AVG(Value) OVER (PARTITION BY ID) AS AverageByID
        -- an empty OVER () treats all rows as one window (average over all rows)
        , AVG(Value) OVER () AS OverallAverage
    FROM data;