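I'm not sure which function to call, but a transpose is the closest thing I can think of.
I have a table in BigQuery that is laid out like this:
But I want to query a table that is laid out like this:
What would the SQL to create this table look like?
Thanks!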
2021 update:
BigQuery has introduced a new UNPIVOT operator.
Before UNPIVOT is used to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:
product | Q1 | Q2 | Q3 | Q4 |
---|---|---|---|---|
Kale | 51 | 23 | 45 | 3 |
Apple | 77 | 0 | 25 | 2 |
After UNPIVOT is used to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:
product | sales | quarter |
---|---|---|
Kale | 51 | Q1 |
Kale | 23 | Q2 |
Kale | 45 | Q3 |
Kale | 3 | Q4 |
Apple | 77 | Q1 |
Apple | 0 | Q2 |
Apple | 25 | Q3 |
Apple | 2 | Q4 |
Query:
WITH Produce AS (
SELECT 'Kale' as product, 51 as Q1, 23 as Q2, 45 as Q3, 3 as Q4 UNION ALL
SELECT 'Apple', 77, 0, 25, 2
)
SELECT * FROM Produce
UNPIVOT(sales FOR quarter IN (Q1, Q2, Q3, Q4))
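For anyone who wants to see the row-by-row effect outside BigQuery, here is a minimal plain-Python sketch of the same reshaping. It is not how BigQuery implements UNPIVOT; the `unpivot` helper and its parameter names are made up for illustration. One detail it mirrors: by default UNPIVOT drops rows whose value is NULL (EXCLUDE NULLS), while zeros are kept.

```python
def unpivot(rows, value_columns, value_name, name_name):
    """Turn each column in value_columns into its own output row."""
    result = []
    for row in rows:
        # Columns that are not unpivoted are carried over unchanged.
        kept = {k: v for k, v in row.items() if k not in value_columns}
        for column in value_columns:
            if row[column] is None:
                continue  # mirrors UNPIVOT's default EXCLUDE NULLS
            result.append({**kept, value_name: row[column], name_name: column})
    return result


produce = [
    {"product": "Kale", "Q1": 51, "Q2": 23, "Q3": 45, "Q4": 3},
    {"product": "Apple", "Q1": 77, "Q2": 0, "Q3": 25, "Q4": 2},
]

long_rows = unpivot(produce, ["Q1", "Q2", "Q3", "Q4"], "sales", "quarter")
for r in long_rows:
    print(r["product"], r["sales"], r["quarter"])
```

Two input rows with four quarter columns each give eight output rows, matching the "after" table.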
2020 update:
fhoffa.x.unpivot()
See:
I created a public persistent UDF. If you have a table a, you can hand the whole row to the UDF to have it unpivoted:
SELECT geo_type, region, transportation_type, unpivotted
FROM `fh-bigquery.public_dump.applemobilitytrends_20200414` a
, UNNEST(fhoffa.x.unpivot(a, '_2020')) unpivotted
It transforms a table like this:
into this:
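As a comment pointed out, my solution above doesn't solve the problem in the question.
So here is a variation, while I work out how to integrate everything into one function: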
CREATE TEMP FUNCTION unpivot(x ANY TYPE) AS (
(
SELECT
ARRAY_AGG(STRUCT(
REGEXP_EXTRACT(y, '[^"]+') AS key
, REGEXP_EXTRACT(y, ':([0-9]+)') AS value
))
FROM UNNEST((
SELECT REGEXP_EXTRACT_ALL(json,'"[smlx][meaxl]'||r'[^:]+:\"?[^"]+?') arr
FROM (SELECT TO_JSON_STRING(x) json))) y
)
);
SELECT location, unpivotted.*
FROM `robotic-charmer-726.bl_test_data.reconfiguring_a_table` x
, UNNEST(unpivot(x)) unpivotted
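The idea behind this UDF is to serialize the whole row to JSON and then pull key/value pairs out of the text, so the column names never have to be listed by hand. A rough Python sketch of that idea follows. It parses the JSON instead of using regular expressions, and the sample row and the `wanted` column names (small, medium, large, xlarge, which is what the regex prefix `[smlx][meaxl]` appears to target) are assumptions, since the original table is not shown.

```python
import json


def unpivot_row(row, wanted):
    """Serialize a row to JSON, then emit one {key, value} pair per wanted column."""
    as_json = json.dumps(row)      # stands in for TO_JSON_STRING(x)
    parsed = json.loads(as_json)   # the SQL version uses regexes here instead
    return [
        {"key": key, "value": value}
        for key, value in parsed.items()
        if key.lower() in wanted
    ]


# Hypothetical row; the real table's contents are not shown in the question.
row = {"location": "Downtown", "small": 10, "medium": 20, "large": 30}

pairs = unpivot_row(row, wanted={"small", "medium", "large", "xlarge"})
for pair in pairs:
    print(row["location"], pair["key"], pair["value"])
```

One difference worth keeping in mind: in the SQL version `value` comes back as a STRING, because REGEXP_EXTRACT returns text, so it may need a CAST before doing arithmetic on it.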
Previous answer:
Use a UNION of tables (written with a comma in BigQuery legacy SQL), plus some column aliasing:
SELECT Location, Size, Quantity
FROM (
SELECT Location, 'Small' as Size, Small as Quantity FROM [table]
), (
SELECT Location, 'Medium' as Size, Medium as Quantity FROM [table]
), (
SELECT Location, 'Large' as Size, Large as Quantity FROM [table]
)
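@Felipe, I tried this with standard SQL, but got an error on the first line of the query: "Column name Location is ambiguous at [1:8]"
I used an alternative query that worked for me: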
SELECT Location, 'Small' as Size, Small as Quantity FROM `table`
UNION ALL
SELECT Location, 'Medium' as Size, Medium as Quantity FROM `table`
UNION ALL
SELECT Location, 'Large' as Size, Large as Quantity FROM `table`
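To try the UNION ALL pattern without a BigQuery project, the sketch below runs the same query shape against an in-memory SQLite database from Python. SQLite is only a stand-in here (its dialect differs from BigQuery's), and the table name `sizes` and its two rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sizes (Location TEXT, Small INTEGER, Medium INTEGER, Large INTEGER)"
)
conn.executemany(
    "INSERT INTO sizes VALUES (?, ?, ?, ?)",
    [("North", 1, 2, 3), ("South", 4, 5, 6)],  # made-up sample data
)

# One SELECT per column to unpivot, stacked with UNION ALL.
query = """
SELECT Location, 'Small' AS Size, Small AS Quantity FROM sizes
UNION ALL
SELECT Location, 'Medium' AS Size, Medium AS Quantity FROM sizes
UNION ALL
SELECT Location, 'Large' AS Size, Large AS Quantity FROM sizes
ORDER BY Location, Quantity
"""
unpivoted = conn.execute(query).fetchall()
conn.close()

for row in unpivoted:
    print(row)
```

Each source row becomes three output rows, one per stacked SELECT.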
I have a solution that uses STRUCTs, ARRAYs, and CROSS JOIN + UNNEST:
WITH
My_Table_Metrics_Data AS (
SELECT
...,
[
STRUCT('...' AS Metric, ... AS Data),
STRUCT('...' AS Metric, ... AS Data)
] AS Metrics_Data
FROM
`My_Dataset.My_Table`
WHERE
...
)
SELECT
...,
Metric_Data
FROM
My_Table_Metrics_Data
CROSS JOIN
UNNEST(My_Table_Metrics_Data.Metrics_Data) AS Metric_Data
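Full explanation and instructions: https://yuhuisdatascienceblog.blogspot.com/2018/06/how-to-unpivot-table-in-bigquery.html
To complement the main answer: to make it dynamic, you can combine it with ".INFORMATION_SCHEMA.COLUMNS".
Because of limitations, and because a declared variable cannot be passed into the UNPIVOT operator, I ended up running the script through Python.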
import pandas as pd
import pandas_gbq as gbq
query = """
SELECT STRING_AGG(column_name) AS Text
FROM (SELECT column_name
FROM Project_Name.Dataset.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'MY_TABLE')
# Columns to leave as-is (exclude from the unpivot)
WHERE column_name NOT IN ('XXX','YYY')"""
df = gbq.read_gbq(query,project_id="nh-cro-forecast")
query = """
CREATE OR REPLACE TABLE Project_Name.Dataset.MY_TABLE_UNPIVOT AS
SELECT *
FROM nh-cro-forecast.MBO.Mexico_2024_Q4
UNPIVOT(VALUE FOR Attribute IN (""" + df.values[0][0] + """));
SELECT 1"""
gbq.read_gbq(query,project_id="nh-cro-forecast")
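Since the column list has to be spliced into the SQL text, it may help to isolate that string-building step in a small function that can be checked without touching BigQuery. The sketch below does only that. `build_unpivot_sql` and its parameters are hypothetical names, the backtick quoting is my own addition (I believe BigQuery accepts backticks around any identifier), and the resulting string would still be passed to `gbq.read_gbq` as in the script above.

```python
def build_unpivot_sql(source_table, columns, value_name="VALUE", name_name="Attribute"):
    """Return a SELECT ... UNPIVOT statement for the given column names."""
    if not columns:
        raise ValueError("at least one column is needed for UNPIVOT")
    # Backticks guard against column names that clash with reserved words.
    column_list = ", ".join(f"`{c}`" for c in columns)
    return (
        f"SELECT *\n"
        f"FROM `{source_table}`\n"
        f"UNPIVOT({value_name} FOR {name_name} IN ({column_list}))"
    )


# Placeholder table and column names, for illustration only.
sql = build_unpivot_sql("Project_Name.Dataset.MY_TABLE", ["Jan", "Feb", "Mar"])
print(sql)
```

Passing the columns as a Python list, rather than a pre-joined string from STRING_AGG, also makes it easy to filter or reorder them before the query is built.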