For example, given the following data in a single field:
"{""noAbsolutValues"":{""HIGHLIGHTS"":[""engineData_startStopSystem"",""search_parkingAssistants"",""heatingCooling_climatisation"",""multimedia_navigationSystem"",""search_seatHeating"",""wheel_multifunctionalWheel""],""CLIMATISATION"":[""selector_climatisation_airCondition""],""MULTIMEDIA"":[""multimedia_navigationSystem"",""multimedia_usbInterface"",""multimedia_radioTuner""],""HEATER"":[""selector_coDriverSeats_electricHeated"",""selector_driverSeats_electricHeated""],""ASSISTANTS"":[""assistants_parkingSensors"",""assistants_cruiseControl""]},""dateValues"":{}}"
At first, I separated out the values I need with the following query:
SELECT
ARRAY_TO_STRING(REGEXP_EXTRACT_ALL(filter_query,r'"[[:alpha:]]+_[[:alpha:]]+_[[:alpha:]]+"|"[[:alpha:]]+_[[:alpha:]]+"'),";") as fq
FROM `table`
The result is a set of rows of varying content, each holding its values separated by semicolons.
For example, one row:
"engineData_startStopSystem";"search_parkingAssistants";"heatingCooling_climatisation";"multimedia_navigationSystem";"search_seatHeating";"wheel_multifunctionalWheel";"selector_climatisation_airCondition";"multimedia_navigationSystem";"multimedia_usbInterface";"multimedia_radioTuner";"selector_coDriverSeats_electricHeated";"selector_driverSeats_electricHeated";"assistants_parkingSensors";"assistants_cruiseControl"
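Now I need to count every variant. Ideally, the result would pair each value with its count, one entry for the value and one for the counting result.
Thank you very much for your help.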
I believe the example given in this thread will help you: Key, Value Count in BigQuery
Test data:
{"fil":{"property":{"id":{"id_1":"a","id_2":"b","id_3":"c","id_4":"d"}}}}
Query:
WITH MyTable AS (
SELECT STRUCT(STRUCT(ARRAY<STRUCT<key STRING, value STRING>>[('id_1', 'a'), ('id_2', 'b'), ('id_3', 'c'), ('id_4', 'd')] AS id) AS property) AS fil
UNION ALL SELECT STRUCT(STRUCT(ARRAY<STRUCT<key STRING, value STRING>>[('id_1', 'b'), ('id_3', 'e')] AS id) AS property) AS fil
UNION ALL SELECT STRUCT(STRUCT(ARRAY<STRUCT<key STRING, value STRING>>[] AS id) AS property) AS fil
UNION ALL SELECT STRUCT(STRUCT(ARRAY<STRUCT<key STRING, value STRING>>[('id_4', 'a'), ('id_2', 'c')] AS id) AS property) AS fil)
SELECT
COUNT(DISTINCT id.key) AS num_keys,
COUNT(DISTINCT id.value) AS num_values
FROM MyTable t, t.fil.property.id AS id;
Output:
+----------+------------+
| num_keys | num_values |
+----------+------------+
| 4 | 5 |
+----------+------------+
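Applied to the column from the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: the column is called filter_query as in the question, the stored JSON uses ordinary double quotes, and the inline `sample` row (a shortened, made-up stand-in for the real `table`) exists just so the query can be run on its own. Instead of joining the matches with ARRAY_TO_STRING, it unnests the array returned by REGEXP_EXTRACT_ALL so that each value becomes its own row, then groups and counts.

```sql
-- Sketch: count how often each extracted value occurs.
-- `sample` is a hypothetical stand-in for the real table; replace it with `table`.
WITH sample AS (
  SELECT '{"noAbsolutValues":{"HIGHLIGHTS":["engineData_startStopSystem","multimedia_navigationSystem"],"MULTIMEDIA":["multimedia_navigationSystem"]},"dateValues":{}}' AS filter_query
)
SELECT
  value,
  COUNT(*) AS occurrences
FROM sample,
  -- One capturing group, so REGEXP_EXTRACT_ALL returns the text inside the quotes.
  -- The pattern matches two- or three-part names joined by underscores,
  -- like the alternation in the question's original regex.
  UNNEST(REGEXP_EXTRACT_ALL(
    filter_query,
    r'"([[:alpha:]]+(?:_[[:alpha:]]+){1,2})"'
  )) AS value
GROUP BY value
ORDER BY occurrences DESC, value;
```

For the sample row this should list multimedia_navigationSystem with a count of 2 and engineData_startStopSystem with a count of 1. Keys such as HIGHLIGHTS are skipped because they contain no underscore. If each value should be counted at most once per source row, COUNT(DISTINCT ...) over a row identifier would be needed instead, which this sketch does not cover.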