I need to calculate average prices grouped by 2 columns, and then select the top 2 values (PostgreSQL 10.1). For example, I have the following structure:
------------------------------------------------------------------------------------------
category | shop_name | price | date |
MSI GeForce RTX 2080 |amazon | 62649 | 1/6/2019 |
MSI GeForce RTX 2080 |amazon | 58668 | 1/17/2019 |
MSI GeForce RTX 2080 |amazon | 62649 | 1/7/2019 |
MSI GeForce RTX 2080 |amazon | 60542 | 1/16/2019 |
MSI GeForce RTX 2080 |amazon | 62649 | 1/5/2019 |
MSI GeForce RTX 2080 |brandstar | 66456 | 1/16/2019 |
MSI GeForce RTX 2080 |brandstar | 66347 | 1/17/2019 |
MSI GeForce RTX 2080 |brandstar | 66456 | 1/16/2019 |
MSI GeForce RTX 2080 |brigo | 63300 | 1/17/2019 |
MSI GeForce RTX 2080 |brigo | 65330 | 1/16/2019 |
MSI GeForce RTX 2080 |brigo | 65330 | 1/16/2019 |
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/16/2019 |
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/17/2019 |
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/18/2019 |
Suppose I want to select the top 2 average results per category and shop_name. I would expect the following result:
category | shop_name | price | date | avg | rank
MSI GeForce RTX 2080 |amazon | 62649 | 1/6/2019 | 61431.4 |1
MSI GeForce RTX 2080 |amazon | 58668 | 1/17/2019 | 61431.4 |1
MSI GeForce RTX 2080 |amazon | 62649 | 1/7/2019 | 61431.4 |1
MSI GeForce RTX 2080 |amazon | 60542 | 1/16/2019 | 61431.4 |1
MSI GeForce RTX 2080 |amazon | 62649 | 1/5/2019 | 61431.4 |1
MSI GeForce RTX 2080 |brandstar | 66456 | 1/16/2019 | 66419.66667 | 3
MSI GeForce RTX 2080 |brandstar | 66347 | 1/17/2019 | 66419.66667 | 3
MSI GeForce RTX 2080 |brandstar | 66456 | 1/16/2019 | 66419.66667 | 3
MSI GeForce RTX 2080 |brigo | 63300 | 1/17/2019 | 64653.33333 | 2
MSI GeForce RTX 2080 |brigo | 65330 | 1/16/2019 | 64653.33333 | 2
MSI GeForce RTX 2080 |brigo | 65330 | 1/16/2019 | 64653.33333 | 2
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/16/2019 | 65330 | 1
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/17/2019 | 65330 | 1
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/18/2019 | 65330 | 1
Then I want to select the rows with a rank less than 3.
But I get the following result instead:
---------------------------------------------------------------------------------------------
MSI GeForce RTX 2080 |amazon | 62649 | 1/6/2019 | 61431.4 | 1 |
MSI GeForce RTX 2080 |amazon | 58668 | 1/17/2019 | 61431.4 | 1 |
MSI GeForce RTX 2080 |amazon | 62649 | 1/7/2019 | 61431.4 | 1 |
MSI GeForce RTX 2080 |amazon | 60542 | 1/16/2019 | 61431.4 | 1 |
MSI GeForce RTX 2080 |amazon | 62649 | 1/5/2019 | 61431.4 | 1 |
MSI GeForce RTX 2080 |brandstar | 66456 | 1/16/2019 | 66419.66667 | 1 |
MSI GeForce RTX 2080 |brandstar | 66347 | 1/17/2019 | 66419.66667 | 1 |
MSI GeForce RTX 2080 |brandstar | 66456 | 1/16/2019 | 66419.66667 | 1 |
MSI GeForce RTX 2080 |brigo | 63300 | 1/17/2019 | 64653.33333 | 1 |
MSI GeForce RTX 2080 |brigo | 65330 | 1/16/2019 | 64653.33333 | 1 |
MSI GeForce RTX 2080 |brigo | 65330 | 1/16/2019 | 64653.33333 | 1 |
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/16/2019 | 65330 | 1
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/17/2019 | 65330 | 1
MSI GeForce RTX 2070 | fake_shop | 65330 | 1/18/2019 | 65330 | 1
Here is my SQL query:
SELECT tt.category,
       tt.shop_name,
       tt.price,
       tt.updated,
       tt.avg_price,
       rank() OVER (PARTITION BY tt.category,
                                 tt.shop_name,
                                 tt.avg_price
                    ORDER BY tt.avg_price DESC)
FROM
  ( SELECT category,
           LOWER(shop_name) AS shop_name,
           CAST (price AS INTEGER) AS price,
           DATE(updated) AS updated,
           avg(price) OVER (PARTITION BY category,
                                         LOWER(shop_name)) AS avg_price
    FROM prices ) AS tt
Just use AVG() OVER () and then DENSE_RANK():
WITH cte1 AS (
SELECT *, AVG(price) OVER (PARTITION BY category, shop_name) AS avg_price
FROM prices
), cte2 AS (
SELECT *, DENSE_RANK() OVER (PARTITION BY category ORDER BY avg_price) AS rnk
FROM cte1
)
SELECT *
FROM cte2
WHERE rnk <= 2
ORDER BY category, shop_name
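For anyone who wants to try the query above without a PostgreSQL instance, here is a minimal sketch that runs it against the sample rows from the question. It uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption; window functions need SQLite 3.25 or newer), and the table layout and the top_shops helper are made up for illustration. The outer SELECT is reduced to one row per shop so the ranks are easy to read.

```python
import sqlite3

# Sample rows from the question: (category, shop_name, price, updated).
ROWS = [
    ("MSI GeForce RTX 2080", "amazon", 62649, "1/6/2019"),
    ("MSI GeForce RTX 2080", "amazon", 58668, "1/17/2019"),
    ("MSI GeForce RTX 2080", "amazon", 62649, "1/7/2019"),
    ("MSI GeForce RTX 2080", "amazon", 60542, "1/16/2019"),
    ("MSI GeForce RTX 2080", "amazon", 62649, "1/5/2019"),
    ("MSI GeForce RTX 2080", "brandstar", 66456, "1/16/2019"),
    ("MSI GeForce RTX 2080", "brandstar", 66347, "1/17/2019"),
    ("MSI GeForce RTX 2080", "brandstar", 66456, "1/16/2019"),
    ("MSI GeForce RTX 2080", "brigo", 63300, "1/17/2019"),
    ("MSI GeForce RTX 2080", "brigo", 65330, "1/16/2019"),
    ("MSI GeForce RTX 2080", "brigo", 65330, "1/16/2019"),
    ("MSI GeForce RTX 2070", "fake_shop", 65330, "1/16/2019"),
    ("MSI GeForce RTX 2070", "fake_shop", 65330, "1/17/2019"),
    ("MSI GeForce RTX 2070", "fake_shop", 65330, "1/18/2019"),
]

# Same CTEs as in the answer; only the final SELECT is narrowed
# to DISTINCT (category, shop_name, rnk) for readability.
QUERY = """
WITH cte1 AS (
    SELECT *, AVG(price) OVER (PARTITION BY category, shop_name) AS avg_price
    FROM prices
), cte2 AS (
    SELECT *, DENSE_RANK() OVER (PARTITION BY category ORDER BY avg_price) AS rnk
    FROM cte1
)
SELECT DISTINCT category, shop_name, rnk
FROM cte2
WHERE rnk <= 2
ORDER BY category, shop_name
"""


def top_shops():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prices "
        "(category TEXT, shop_name TEXT, price INTEGER, updated TEXT)"
    )
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_shops():
        print(row)
```

With the ascending ORDER BY, the two shops with the lowest averages in each category are kept, so brandstar (rank 3 in the question's expected output) is filtered out.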
I think you want:
select tt.category, tt.shop_name, tt.price, tt.updated, tt.avg_price,
dense_rank() over (partition by tt.category order by tt.avg_price desc)
from (select category, lower(shop_name) as shop_name,
(price::int) as price, updated::date as updated,
avg(price) over (partition by category, lower(shop_name)) as avg_price
from prices
) tt
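I simplified some of the logic, but the main change is the partition by for rank(). You seem to want one rank per shop within each category. In your query each partition (category, shop_name, avg_price) contains only a single average, so every row gets rank 1. dense_rank() is also more appropriate, because each shop has several rows with the same average and rank() would leave gaps after them. Note that desc ranks the highest average first; drop it if rank 1 should be the lowest average, as in your expected output.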
If you want to distinguish categories that have the same average price:
dense_rank() over (partition by tt.shop_name order by tt.avg_price desc, category)
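To see why the partition by change matters, here is a rough Python sketch that mimics dense ranking over the per-shop averages from the question. The dense_rank helper and the SHOP_AVGS list are hypothetical names introduced only for this illustration; this is not how PostgreSQL implements the window function.

```python
from collections import defaultdict

# (category, shop_name, avg_price) per shop, values taken from the question.
SHOP_AVGS = [
    ("MSI GeForce RTX 2080", "amazon", 61431.4),
    ("MSI GeForce RTX 2080", "brandstar", 66419.66667),
    ("MSI GeForce RTX 2080", "brigo", 64653.33333),
    ("MSI GeForce RTX 2070", "fake_shop", 65330.0),
]


def dense_rank(rows, partition_key, descending=True):
    """Return {row: rank}, ranking avg_price inside each partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row)].append(row)

    ranks = {}
    for members in partitions.values():
        # Distinct averages in ranking order; equal values share a rank.
        distinct = sorted({r[2] for r in members}, reverse=descending)
        position = {value: i + 1 for i, value in enumerate(distinct)}
        for r in members:
            ranks[r] = position[r[2]]
    return ranks


# Partition as in the question: category, shop_name and avg_price.
# Every partition holds exactly one average, so every rank is 1.
original = dense_rank(SHOP_AVGS, lambda r: (r[0], r[1], r[2]))

# Partition by category only: shops are ranked against each other.
fixed = dense_rank(SHOP_AVGS, lambda r: r[0])

if __name__ == "__main__":
    for row in SHOP_AVGS:
        print(row[1], original[row], fixed[row])
```

With descending=True (matching desc in the query) brandstar gets rank 1 and amazon rank 3; passing descending=False reproduces the ranks in the question's expected output (amazon 1, brigo 2, brandstar 3).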