How to sum an array column grouped by the elements of another array column in ClickHouse


I have a ClickHouse table that contains several array columns, for example:

timestamp  sensor_type                    priority                values
10:00:00   ['a','b','b','a','c','a','c']  [3, 2, 1, 5, 1, 2, 1]   [7, 4, 1, 12, 3, 9, 2]
10:01:00   ['c','e','g','e','g']          [2, 4, 1, 2, 4]         [23, 3, 5, 8, 6]
...

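The timestamps are unique and monotonically increasing. The set of sensors recording values changes dynamically at each timestamp. For each timestamp, I am trying to group and sum the values array by the elements of the sensor_type array and, separately, by the elements of the priority array, so the expected aggregated columns are as follows: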

timestamp  sensor_type_sorted  sum_val_by_sensor_type  priority_sorted  sum_val_by_priority
10:00:00   ['a', 'b', 'c']     [28, 5, 5]              [1, 2, 3, 5]     [6, 13, 7, 12]
10:01:00   ['c', 'e', 'g']     [23, 11, 11]            [1, 2, 4]        [5, 31, 9]
...

How can I achieve this?

sql clickhouse
1 Answer

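First, convert the elements of the sensor-type and values columns into rows. Then compute the sums with a GROUP BY on (id, type), and finally collect the rows back into array columns. Query 1: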

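-- Note: LATERAL VIEW, POSEXPLODE and collect_list are Hive / Spark SQL constructs, not ClickHouse syntax.
-- The columns id, type and value appear to stand in for timestamp, sensor_type and values from the question.
select id, collect_list(typ) as sensor_type_sorted, collect_list(val) as sum_val_by_sensor_type
from (
    -- sum the values per (id, sensor type)
    select id, typ, sum(val) as val
    from (
        -- explode both arrays with their positions and keep only the pairs at the same index
        select s.id, s_type.typ, s_value.val
        from sensor s
        LATERAL VIEW POSEXPLODE(s.type) s_type as seqt, typ
        LATERAL VIEW POSEXPLODE(s.value) s_value as seqv, val
        where seqt = seqv
    ) temp1
    group by id, typ
    order by id, typ
) calculated
group by id;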
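
The same explode, sum, and re-collect steps can be sketched in ClickHouse's own dialect. This is a rough translation rather than a tested solution: the table name sensor is taken from the query above, the column names come from the question, and the aliases typ, v and val are illustrative. ARRAY JOIN over two arrays of equal length unrolls them in lockstep, which takes the place of the two POSEXPLODE views and the seqt = seqv filter. The (typ, val) pairs are sorted as tuples after aggregation, so the result does not rely on the row order of the subquery.

```sql
SELECT
    timestamp,
    -- sort the (type, sum) pairs by type, then split them back into two arrays
    arrayMap(x -> x.1, arraySort(groupArray((typ, val)))) AS sensor_type_sorted,
    arrayMap(x -> x.2, arraySort(groupArray((typ, val)))) AS sum_val_by_sensor_type
FROM
(
    SELECT timestamp, typ, sum(v) AS val
    FROM sensor                                   -- assumed table name
    ARRAY JOIN sensor_type AS typ, `values` AS v  -- unrolls both arrays position by position
    GROUP BY timestamp, typ
)
GROUP BY timestamp
ORDER BY timestamp;
```

For the 10:00:00 row in the question, the inner query yields (a, 28), (b, 5), (c, 5), which matches the expected sum_val_by_sensor_type.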

Do the same for the priority and values columns to create query 2. Then join query 1 and query 2 on id.
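
In ClickHouse the join can probably be avoided altogether, because each row already holds everything needed for its own result. The following is a minimal sketch using higher-order array functions, again assuming a table named sensor with the columns from the question. For each distinct key it filters the values array by the positions where the key array matches and sums what remains.

```sql
SELECT
    timestamp,
    arraySort(arrayDistinct(sensor_type)) AS sensor_type_sorted,
    -- for each distinct sensor type t, sum the values at the positions where sensor_type = t
    arrayMap(t -> arraySum(arrayFilter((v, s) -> s = t, `values`, sensor_type)),
             sensor_type_sorted) AS sum_val_by_sensor_type,
    arraySort(arrayDistinct(priority)) AS priority_sorted,
    -- same idea, keyed by priority
    arrayMap(p -> arraySum(arrayFilter((v, q) -> q = p, `values`, priority)),
             priority_sorted) AS sum_val_by_priority
FROM sensor          -- assumed table name
ORDER BY timestamp;
```

This version scans each array once per distinct key, which should be fine for short arrays like those in the example. For very long arrays with many distinct keys, the ARRAY JOIN and GROUP BY formulation may scale better.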
