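I have a query in BigQuery and I want the averages per quarter. With my current SQL, the Q1 values for id1 are the same as for id2.
This is what I have, and the values are correct: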
row|averages|quarter|identifier
-----------------------------
1 | 10 | 1 | id1
2 | 20 | 2 | id1
3 | 30 | 1 | id2
4 | 40 | 2 | id2
This is the SQL I wrote for the structure above, and it gives the correct values:
WITH
  index_cal AS (
    SELECT
      `values-01`,  -- backticks are needed because the name contains a hyphen
      kind,
      EXTRACT(QUARTER FROM date) AS quarter,
      date
    FROM
      `project.dataset.table` ),
  geom AS (
    SELECT
      identifier
    FROM
      `project.dataset.table2` )
SELECT
  AVG(g.`values-01`) AS averages,
  g.quarter AS quarter,
  geom.identifier AS identifier
FROM
  index_cal AS g
INNER JOIN
  geom
ON
  -- predicate kept as posted; BigQuery's built-in geography predicate is ST_INTERSECTS
  INTERSECTS(g.kind, geom.identifier)
GROUP BY
  identifier,
  quarter
What I want is to group the quarterly values under each identifier, so that each identifier has only one associated row:
row | averages | quarter | identifier
----------------------------------
1 | 10 | 1 | id1
| 20 | 2 |
----------------------------------
2 | 30 | 1 | id2
| 40 | 2 |
----------------------------------
To get the desired structure, where id1 (and likewise every other identifier) has only one associated row, I wrote this SQL query:
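WITH
  index_cal AS (
    SELECT
      `values-01`,  -- backticks are needed because the name contains a hyphen
      kind,
      EXTRACT(QUARTER FROM date) AS quarter,
      date
    FROM
      `project.dataset.table` ),
  geom AS (
    SELECT
      identifier
    FROM
      `project.dataset.table2` )
SELECT
  ARRAY(
    -- this subquery reads all of index_cal and is not tied to the outer identifier
    SELECT
      AS STRUCT AVG(`values-01`) AS averages,
      quarter AS quarter
    FROM
      index_cal
    GROUP BY
      quarter ) AS index,
  geom.identifier AS identifier
FROM
  index_cal AS g
INNER JOIN
  geom
ON
  -- predicate kept as posted; BigQuery's built-in geography predicate is ST_INTERSECTS
  INTERSECTS(g.kind, geom.identifier)
GROUP BY
  identifier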
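When I run this query, I get the averages across all identifiers grouped by quarter, so the same values are repeated for every identifier (for example, 15 and 25 in this case):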
row | averages | quarter | identifier
----------------------------------
1 | 15 | 1 | id1
| 25 | 2 |
----------------------------------
2 | 15 | 1 | id2
| 25 | 2 |
----------------------------------
3 | 15 | 1 | id3
| 25 | 2 |
----------------------------------
What I ultimately want is the average of values-01 per quarter for each identifier. Currently the averages are the same for every value of identifier.
Using ARRAY_AGG on top of the original query, the one that gave the correct values, solved it:
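WITH
  index_cal AS (
    SELECT
      `values-01`,  -- backticks are needed because the name contains a hyphen
      kind,
      EXTRACT(QUARTER FROM date) AS quarter,
      date
    FROM
      `project.dataset.table` ),
  geom AS (
    SELECT
      identifier
    FROM
      `project.dataset.table2` ),
  -- the original query: one row per (identifier, quarter)
  final_cal AS (
    SELECT
      AVG(g.`values-01`) AS averages,
      g.quarter AS quarter,
      geom.identifier AS identifier
    FROM
      index_cal AS g
    INNER JOIN
      geom
    ON
      -- predicate kept as posted; BigQuery's built-in geography predicate is ST_INTERSECTS
      INTERSECTS(g.kind, geom.identifier)
    GROUP BY
      identifier,
      quarter )
-- collapse the per-quarter rows into one array per identifier
SELECT identifier, ARRAY_AGG(STRUCT(averages, quarter)) FROM final_cal GROUP BY identifier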
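
To make the ARRAY_AGG step easier to try in isolation, here is a minimal sketch that replaces final_cal with hard-coded sample rows taken from the first table in the question. The inline UNNEST data, the ORDER BY inside ARRAY_AGG, and the column alias quarterly_averages are my own additions for illustration and are not part of the original query.

```sql
-- Hypothetical stand-in for final_cal: the four sample rows from the question.
WITH final_cal AS (
  SELECT *
  FROM UNNEST([
    STRUCT(10 AS averages, 1 AS quarter, 'id1' AS identifier),
    STRUCT(20, 2, 'id1'),
    STRUCT(30, 1, 'id2'),
    STRUCT(40, 2, 'id2')
  ])
)
SELECT
  identifier,
  -- One array element per quarter; ORDER BY keeps the elements sorted by quarter.
  ARRAY_AGG(STRUCT(averages, quarter) ORDER BY quarter) AS quarterly_averages
FROM
  final_cal
GROUP BY
  identifier
ORDER BY
  identifier
```

This should return one row per identifier, with the per-quarter averages nested inside it, which is the layout shown in the desired-output table. The earlier ARRAY(SELECT ...) attempt repeated the same values because its subquery read all of index_cal without any reference to the outer identifier, whereas ARRAY_AGG only collects the rows that belong to each GROUP BY group.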