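How to unpivot in BigQuery?

Question (votes: 0, answers: 5)

I'm not sure what function to call, but transpose is the closest thing I can think of.

I have a table in BigQuery that is laid out like this: (image)

But I would like to query a table that is laid out like this:

(image)

What would the SQL code to create this table look like?

Thanks!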

sql google-bigquery transpose unpivot
5 Answers
35 votes

2021 update:

BigQuery has introduced a new UNPIVOT operator.

Before using UNPIVOT to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:

product  Q1  Q2  Q3  Q4
Kale     51  23  45  3
Apple    77  0   25  2

After using UNPIVOT to rotate Q1, Q2, Q3, Q4 into sales and quarter columns:

product  sales  quarter
Kale     51     Q1
Kale     23     Q2
Kale     45     Q3
Kale     3      Q4
Apple    77     Q1
Apple    0      Q2
Apple    25     Q3
Apple    2      Q4

Query:

WITH Produce AS (
  SELECT 'Kale' as product, 51 as Q1, 23 as Q2, 45 as Q3, 3 as Q4 UNION ALL
  SELECT 'Apple', 77, 0, 25, 2
)
SELECT * FROM Produce
UNPIVOT(sales FOR quarter IN (Q1, Q2, Q3, Q4))
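
To spell out the reshaping, here is a minimal Python sketch of what the query above does to each row. The function name `unpivot_rows` and the list-of-dicts representation are assumptions made for illustration; this is not how BigQuery implements the operator. It skips `None` values, mirroring UNPIVOT's default EXCLUDE NULLS behaviour.

```python
def unpivot_rows(rows, value_name, name_column, columns):
    """Turn each listed column into its own output row (illustrative only)."""
    out = []
    for row in rows:
        # Columns that are not being unpivoted are carried over unchanged.
        kept = {k: v for k, v in row.items() if k not in columns}
        for col in columns:
            if row[col] is None:
                # UNPIVOT drops NULL values by default (EXCLUDE NULLS).
                continue
            out.append({**kept, value_name: row[col], name_column: col})
    return out


# Same sample data as the Produce CTE in the query.
produce = [
    {"product": "Kale", "Q1": 51, "Q2": 23, "Q3": 45, "Q4": 3},
    {"product": "Apple", "Q1": 77, "Q2": 0, "Q3": 25, "Q4": 2},
]
result = unpivot_rows(produce, "sales", "quarter", ["Q1", "Q2", "Q3", "Q4"])
for r in result:
    print(r)
```

A zero such as Apple's Q2 value is kept; only missing values are dropped.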

15 votes

2020 update: fhoffa.x.unpivot()

I created a public persistent UDF. If you have a table a, you can pass the whole row to the UDF to have it unpivoted:

SELECT geo_type, region, transportation_type, unpivotted
FROM `fh-bigquery.public_dump.applemobilitytrends_20200414` a
  , UNNEST(fhoffa.x.unpivot(a, '_2020')) unpivotted

It transforms a table like this:

(image)

into this:

(image)


As the comments pointed out, my solution above does not solve the problem in the question.

So here is a variant, while I work out how to combine everything into one:

CREATE TEMP FUNCTION unpivot(x ANY TYPE) AS (
(
  SELECT 
   ARRAY_AGG(STRUCT(
     REGEXP_EXTRACT(y, '[^"]+') AS key
   , REGEXP_EXTRACT(y, ':([0-9]+)') AS value
   ))
  FROM UNNEST((
    SELECT REGEXP_EXTRACT_ALL(json,'"[smlx][meaxl]'||r'[^:]+:\"?[^"]+?') arr
    FROM (SELECT TO_JSON_STRING(x) json))) y
)
);

SELECT location, unpivotted.*
FROM `robotic-charmer-726.bl_test_data.reconfiguring_a_table` x
  , UNNEST(unpivot(x)) unpivotted
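
The idea behind the temp function (serialize the whole row to JSON, then pull out key/value pairs for the columns of interest) can be sketched in Python. This is only an illustration of the idea, not the UDF's implementation: `unpivot_row`, the regular expression, and the sample row are all made up here, and unlike the SQL version, which extracts values as strings, this sketch keeps the original value types.

```python
import json
import re


def unpivot_row(row, column_pattern):
    """Return [{'key': ..., 'value': ...}] for columns whose name matches the pattern."""
    # Round-trip through JSON to mimic TO_JSON_STRING(x) on a whole row.
    as_json = json.dumps(row)
    fields = json.loads(as_json)
    return [
        {"key": name, "value": value}
        for name, value in fields.items()
        if re.search(column_pattern, name)
    ]


# Hypothetical row; the column names follow the later answers in this thread.
row = {"location": "Paris", "small": 3, "medium": 5, "large": 2}
pairs = unpivot_row(row, r"^(small|medium|large)$")
print(pairs)
```

Columns that do not match the pattern (here `location`) are left out of the key/value list, so they can be selected alongside it as in the query above.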


Previous answer:

Use a UNION of the tables (in BigQuery legacy SQL, a comma between tables means UNION ALL), plus some column aliasing:

SELECT Location, Size, Quantity
FROM (
  SELECT Location, 'Small' as Size, Small as Quantity FROM [table]
), (
  SELECT Location, 'Medium' as Size, Medium as Quantity FROM [table]
), (
  SELECT Location, 'Large' as Size, Large as Quantity FROM [table]
)

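2 votes

@Felipe, I tried this with standard SQL but got an error on the first line of the query: "Column name Location is ambiguous at [1:8]"

I used an alternative query that worked for me: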

SELECT Location, 'Small' as Size, Small as Quantity FROM `table`
UNION ALL
SELECT Location, 'Medium' as Size, Medium as Quantity FROM `table`
UNION ALL
SELECT Location, 'Large' as Size, Large as Quantity FROM `table`
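
With many columns, writing one SELECT per column by hand gets tedious. A minimal Python sketch that generates the same UNION ALL text follows; the helper name `union_all_unpivot` and its parameters are hypothetical, and it splices identifiers straight into the SQL, so it should only be fed trusted column names.

```python
def union_all_unpivot(table, id_column, value_columns,
                      name_column="Size", value_column="Quantity"):
    """Build one SELECT per column and join them with UNION ALL."""
    selects = [
        # Identifiers are interpolated without escaping: trusted input only.
        f"SELECT {id_column}, '{col}' AS {name_column}, "
        f"{col} AS {value_column} FROM `{table}`"
        for col in value_columns
    ]
    return "\nUNION ALL\n".join(selects)


sql = union_all_unpivot("table", "Location", ["Small", "Medium", "Large"])
print(sql)
```

Note that this approach reads the table once per unpivoted column.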

1 vote

I have a solution that uses STRUCTs, ARRAYs and CROSS JOIN + UNNEST:

WITH
  My_Table_Metrics_Data AS (
  SELECT
    ...,
    [
        STRUCT('...' AS Metric, ... AS Data),
        STRUCT('...' AS Metric, ... AS Data)
    ] AS Metrics_Data
  FROM
    `My_Dataset.My_Table`
  WHERE
    ...
  )
SELECT
  ...,
  Metric_Data
FROM
  My_Table_Metrics_Data
CROSS JOIN
  UNNEST(My_Table_Metrics_Data.Metrics_Data) AS Metric_Data

Full explanation and instructions: https://yuhuisdatascienceblog.blogspot.com/2018/06/how-to-unpivot-table-in-bigquery.html
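
To make the template above concrete, here is a small Python sketch that fills it in for the Location / Small / Medium / Large table from the other answers. The helper `struct_unnest_sql` and the CTE name `With_Metrics` are assumptions for illustration. The array elements must share a compatible type, so this presumes all unpivoted columns hold the same kind of value.

```python
def struct_unnest_sql(table, id_column, value_columns):
    """Build a STRUCT-array + CROSS JOIN UNNEST query (illustrative only)."""
    # One STRUCT per column: the column name as a label plus its value.
    structs = ",\n      ".join(
        f"STRUCT('{col}' AS Metric, {col} AS Data)" for col in value_columns
    )
    return (
        "WITH With_Metrics AS (\n"
        f"  SELECT {id_column},\n"
        f"    [\n      {structs}\n    ] AS Metrics_Data\n"
        f"  FROM `{table}`\n"
        ")\n"
        f"SELECT {id_column}, Metric_Data.Metric, Metric_Data.Data\n"
        "FROM With_Metrics\n"
        "CROSS JOIN UNNEST(With_Metrics.Metrics_Data) AS Metric_Data"
    )


sql = struct_unnest_sql("table", "Location", ["Small", "Medium", "Large"])
print(sql)
```

Unlike the UNION ALL variant, the generated query scans the source table only once.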


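0 votes

To complement the main answer: to make it dynamic, you can combine it with "INFORMATION_SCHEMA.COLUMNS".

Because of a limitation, declared variables cannot be used inside the UNPIVOT operator, so I ended up running the script through Python.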

import pandas as pd
import pandas_gbq as gbq

query = """
SELECT STRING_AGG(column_name) AS Text
  FROM (SELECT column_name 
    FROM Project_Name.Dataset.INFORMATION_SCHEMA.COLUMNS
    WHERE table_name = 'MY_TABLE') 
  # Columns to keep as they are (not unpivoted)
  WHERE column_name NOT IN  ('XXX','YYY')"""

df = gbq.read_gbq(query,project_id="nh-cro-forecast")

query = """
CREATE OR REPLACE TABLE Project_Name.Dataset.MY_TABLE_UNPIVOT AS
SELECT *
FROM nh-cro-forecast.MBO.Mexico_2024_Q4
UNPIVOT(VALUE FOR Attribute IN (""" + df.values[0][0] + """));
SELECT 1"""

gbq.read_gbq(query,project_id="nh-cro-forecast")
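
Because the column list is concatenated into the SQL text, it may be worth building and checking the statement in a separate step before sending it. A minimal sketch, assuming the column names have already been fetched from INFORMATION_SCHEMA.COLUMNS; the helper `build_unpivot_sql`, the identifier check, and the sample column names `Jan` and `Feb` are hypothetical and not part of the answer above.

```python
import re

# Plain identifiers only: letters, digits and underscores, not starting with a digit.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_unpivot_sql(source_table, target_table, all_columns, keep_columns):
    """Return a CREATE TABLE ... UNPIVOT statement for columns not in keep_columns."""
    unpivot_columns = [c for c in all_columns if c not in keep_columns]
    if not unpivot_columns:
        raise ValueError("no columns left to unpivot")
    for name in unpivot_columns:
        # Names are spliced into the SQL text, so reject anything unusual.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected column name: {name!r}")
    column_list = ", ".join(unpivot_columns)
    return (
        f"CREATE OR REPLACE TABLE `{target_table}` AS\n"
        "SELECT *\n"
        f"FROM `{source_table}`\n"
        f"UNPIVOT(VALUE FOR Attribute IN ({column_list}))"
    )


sql = build_unpivot_sql(
    "Project_Name.Dataset.MY_TABLE",
    "Project_Name.Dataset.MY_TABLE_UNPIVOT",
    ["XXX", "YYY", "Jan", "Feb"],
    {"XXX", "YYY"},
)
print(sql)
```

The resulting string can then be passed to the same `gbq.read_gbq` call used above.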