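I have a dataset containing user_ids, visitStartTime, and the prices of the products each user viewed. I am trying to get the average and maximum price for each user's visit, but my query does not compute over the (user + visitStartTime) partition; it computes over the user_id partition only.
Here is my query: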
select distinct fullVisitorId ,visitStartTime,
avg(pr) over (partition by visitStartTime,fullVisitorId) as avgPrice,
max(pr) over (partition by fullVisitorId,visitStartTime) as maxPrice
from dataset
Here is what I get:
+-----+-------------------+----------------+----------+----------+
| Row | fullVisitorId     | visitStartTime | avgPrice | maxPrice |
+-----+-------------------+----------------+----------+----------+
| 1   | 64217461724617261 | 1538478049     | 484.5    | 969.0    |
| 2   | 64217461724617261 | 1538424725     | 484.5    | 969.0    |
+-----+-------------------+----------------+----------+----------+
What is missing from my query?
Sample data:
+---------------+----------------+---------------+
| FullVisitorId | VisitStartTime | ProductPrice |
+---------------+----------------+---------------+
| 123 | 72631241 | 100 |
| 123 | 72631241 | 250 |
| 123 | 72631241 | 10 |
| 123 | 73827882 | 70 |
| 123 | 73827882 | 90 |
+---------------+----------------+---------------+
Expected result:
+-----+---------------+----------------+----------+----------+
| Row | fullVisitorId | visitStartTime | avgPrice | maxPrice |
+-----+---------------+----------------+----------+----------+
| 1   | 123           | 72631241       | 120.0    | 250.0    |
| 2   | 123           | 73827882       | 80.0     | 90.0     |
+-----+---------------+----------------+----------+----------+
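You don't need PARTITION BY in this case. You want one row per user and visit, which is exactly what a plain GROUP BY on those two columns returns.
Try this: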
select fullVisitorId ,visitStartTime, avg(ProductPrice) avgPrice ,max(ProductPrice) maxPrice
from sample
group by FullVisitorId,VisitStartTime;
(The query is plain standard SQL, so I think you can use it in BigQuery as well.)
Here is the output using PostgreSQL: DB<>FIDDLE
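To illustrate why GROUP BY is the more natural fit, here is a minimal sketch using the sample rows from the question. The CTE named sample and the extra ProductPrice output column are assumptions added only for illustration. A window function with PARTITION BY keeps one output row per input row and repeats the aggregate on each of them, which is why the windowed version needs DISTINCT to collapse the duplicates, whereas GROUP BY returns one row per group directly.

```sql
-- Hypothetical CTE holding the sample rows from the question.
WITH sample AS (
  SELECT 123 AS FullVisitorId, 72631241 AS VisitStartTime, 100 AS ProductPrice
  UNION ALL SELECT 123, 72631241, 250
  UNION ALL SELECT 123, 72631241, 10
  UNION ALL SELECT 123, 73827882, 70
  UNION ALL SELECT 123, 73827882, 90
)
-- Window version: no rows are collapsed, so every input row comes back
-- with its partition's aggregate values attached.
SELECT
  FullVisitorId,
  VisitStartTime,
  ProductPrice,
  AVG(ProductPrice) OVER (PARTITION BY FullVisitorId, VisitStartTime) AS avgPrice,
  MAX(ProductPrice) OVER (PARTITION BY FullVisitorId, VisitStartTime) AS maxPrice
FROM sample;
```

On the sample data this returns five rows: the three rows of visit 72631241 each carry an average of 120 and a maximum of 250, and the two rows of visit 73827882 each carry an average of 80 and a maximum of 90. The GROUP BY query returns the same two distinct aggregate rows without the extra DISTINCT step.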
Update
It also works in BigQuery Standard SQL:
#standardSQL
SELECT
FullVisitorId,
VisitStartTime,
AVG(ProductPrice) as avgPrice,
MAX(ProductPrice) as maxPrice
FROM `project.dataset.table`
GROUP BY FullVisitorId, VisitStartTime
If you want to test it:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 123 FullVisitorId, 72631241 VisitStartTime, 100 ProductPrice
UNION ALL SELECT 123, 72631241, 250
UNION ALL SELECT 123, 72631241, 10
UNION ALL SELECT 123, 73827882, 70
UNION ALL SELECT 123, 73827882, 90
)
SELECT
FullVisitorId,
VisitStartTime,
AVG(ProductPrice) as avgPrice,
MAX(ProductPrice) as maxPrice
FROM `project.dataset.table`
GROUP BY FullVisitorId, VisitStartTime