Parse key-value pairs, select based on value, and create a new column using SQL

Question

I have a table, my_table, like the one below. Col2 is of string type:

| Col1         | Col2                   |
|--------------|------------------------|
| some_string  |{key1:0, key2:1, key3:0}|
| some_string  |{key1:0, key2:0, key3:0}|
| some_string  |{key1:1, key2:2, key3:3}|
| some_string  |{key1:1, key2:1, key3:0}|

I want to use SQL to parse the key:value pairs in Col2 and filter out the pairs whose value is "0", so that the output is:

| Col1         | Col2                   |
|--------------|------------------------|
| some_string  |{key2:1}                |
| some_string  |{}                      |
| some_string  |{key1:1, key2:2, key3:3}|
| some_string  |{key1:1, key2:1}        |

Can anyone help?

Answer 1

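Here is an approach you can take with Redshift SQL:

1. Use string functions to split Col2 into its component key-value pairs.
2. Filter out the pairs whose value is "0".
3. Rebuild the string in a JSON-like format.

This is somewhat involved because Redshift does not parse dynamic strings into columns or rows as easily as newer SQL databases do. Typically, you would handle this kind of data in the application layer or use a database with JSON processing capabilities.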

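-- Assumes exactly three key:value pairs per row and that (Col1, Col2) identifies a row.
SELECT Col1,
       '{' || COALESCE(
                  LISTAGG(CASE WHEN part_value <> '0' THEN part_key || ':' || part_value END, ', ')
                      WITHIN GROUP (ORDER BY part_no),
                  '') || '}' AS Col2_filtered  -- COALESCE yields '{}' when every pair is dropped
FROM (
    SELECT Col1, Col2, n.n AS part_no,
           -- Strip braces and spaces first so that keys and values compare cleanly.
           SPLIT_PART(SPLIT_PART(TRANSLATE(Col2, '{} ', ''), ',', n.n), ':', 1) AS part_key,
           SPLIT_PART(SPLIT_PART(TRANSLATE(Col2, '{} ', ''), ',', n.n), ':', 2) AS part_value
    FROM my_table
    CROSS JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) n
    WHERE Col2 LIKE '%:%'
) sub
-- Group per source row; grouping by Col1 alone would merge rows that share the same Col1.
GROUP BY Col1, Col2;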
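
To see what the nested SPLIT_PART calls in the query above return, the splitting step can be tried on a single literal. This is only a minimal sketch: the literals and the column aliases are illustrative, and in the first statement the braces and spaces have been removed from the literal by hand.

```sql
-- The inner SPLIT_PART picks the n-th comma-separated pair;
-- the outer SPLIT_PART splits that pair on ':' into key and value.
SELECT SPLIT_PART('key1:0,key2:1,key3:0', ',', 2)                     AS pair,        -- 'key2:1'
       SPLIT_PART(SPLIT_PART('key1:0,key2:1,key3:0', ',', 2), ':', 1) AS part_key,    -- 'key2'
       SPLIT_PART(SPLIT_PART('key1:0,key2:1,key3:0', ',', 2), ':', 2) AS part_value;  -- '1'

-- With the raw Col2 text, the last pair keeps its closing brace,
-- so its value is '0}' rather than '0'.
SELECT SPLIT_PART(SPLIT_PART('{key1:0, key2:1, key3:0}', ',', 3), ':', 2) AS raw_value;  -- '0}'
```

The second statement shows why a comparison against '0' is only reliable once the braces and spaces have been stripped from Col2.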

Answer 2

So, let's create the schema:

create table mytable(col1 text, col2 text);

insert into mytable(col1, col2)
values
('some_string', '{key1:0, key2:1, key3:0}'),
('some_string', '{key1:0, key2:0, key3:0}'),
('some_string', '{key1:1, key2:2, key3:3}'),
('some_string', '{key1:1, key2:1, key3:0}');

To compute the values you want, use

select col2 as input, regexp_replace(col2, '(, )?key\d:0(, )?', '', 'g') as output
from mytable;

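if you want to select them.

regexp_replace replaces the matches of the pattern (param2) in the string (param1) with the new string (param3). The 'g' flag specifies that all matches are to be replaced.

Fiddle: https://www.db-fiddle.com/f/omvF3XLrmCpYM79YYTA1wH/0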

If you need to update the values, you can:

update mytable
set col2 = regexp_replace(col2, '(, )?key\d:0(, )?', '', 'g');

Fiddle: https://www.db-fiddle.com/f/omvF3XLrmCpYM79YYTA1wH/1
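
One case the sample data does not cover is a zero-valued pair in the middle of non-zero pairs: the pattern above removes the separators on both sides of it, so the neighbouring pairs run together. The sketch below compares that pattern with a two-step variant on an inline VALUES list. It assumes PostgreSQL regex syntax (`\M` is the end-of-word constraint) and integer values; the sample rows and the column aliases are illustrative and not part of the original answer.

```sql
SELECT col2 AS input,
       -- Pattern from the answer: strips the ', ' on both sides of a zero pair.
       regexp_replace(col2, '(, )?key\d:0(, )?', '', 'g') AS original_pattern,
       -- Two-step variant: drop each zero pair with its trailing ', ',
       -- then remove a dangling ', ' left in front of the closing brace.
       regexp_replace(
           regexp_replace(col2, '\w+:0\M(, )?', '', 'g'),
           ', \}$', '}'
       ) AS two_step
FROM (VALUES ('{key1:1, key2:0, key3:1}'),
             ('{key1:0, key2:1, key3:0}')) AS t(col2);
-- Row 1: original_pattern = '{key1:1key3:1}', two_step = '{key1:1, key3:1}'
-- Row 2: original_pattern = '{key2:1}',       two_step = '{key2:1}'
```

The `\M` constraint keeps a value such as `05` from being treated as zero; non-integer values such as `0.5` would need a stricter pattern.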

Answer 3

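Since Redshift is a data warehouse based on Postgres, this query may work for you.

First, we split Col2 into key-value pairs, then we filter out the elements whose value is 0, and finally we rebuild Col2: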

WITH cte AS (
   SELECT Col1, Col2,
       UNNEST(STRING_TO_ARRAY(TRANSLATE(Col2, '{ }', ''), ',')) AS Col2_splitted
   FROM my_table
)
SELECT Col1, Col2, '{' || STRING_AGG(col2_splitted, ', ') || '}' AS Col2_filtered
FROM cte
WHERE split_part(col2_splitted, ':', 2) <> '0'
GROUP BY Col1, Col2
UNION ALL
SELECT Col1, Col2, '{}'
FROM cte
GROUP BY Col1, Col2
HAVING COUNT(CASE WHEN split_part(col2_splitted, ':', 2) <> '0' THEN 1 END) = 0;

Result:

col1        col2                        col2_filtered
some_string {key1:0, key2:1, key3:0}    {key2:1}
some_string {key1:1, key2:2, key3:3}    {key1:1, key2:2, key3:3}
some_string {key1:1, key2:1, key3:0}    {key1:1, key2:1}
some_string {key1:0, key2:0, key3:0}    {}

PostgreSQL demo
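
The same split, filter, and rebuild steps can also be written without the UNION ALL branch. The following is a sketch under stated assumptions rather than part of the original answer: it targets PostgreSQL 9.4 or later (for `FILTER` and `WITH ORDINALITY`), and the `my_table` CTE only stands in for the real table with two sample rows.

```sql
WITH my_table(Col1, Col2) AS (
    -- Sample data only; drop this CTE to query the real table.
    VALUES ('some_string', '{key1:0, key2:1, key3:0}'),
           ('some_string', '{key1:0, key2:0, key3:0}')
),
pairs AS (
    -- One row per key:value pair, with its position in the original string.
    SELECT t.Col1, t.Col2, p.pair, p.ord
    FROM my_table AS t
    CROSS JOIN LATERAL unnest(string_to_array(translate(t.Col2, '{ }', ''), ','))
         WITH ORDINALITY AS p(pair, ord)
)
SELECT Col1, Col2,
       -- FILTER skips zero-valued pairs; COALESCE turns "no pairs left" into '{}'.
       '{' || COALESCE(
                  string_agg(pair, ', ' ORDER BY ord)
                      FILTER (WHERE split_part(pair, ':', 2) <> '0'),
                  '') || '}' AS Col2_filtered
FROM pairs
GROUP BY Col1, Col2;
```

Ordering by `ord` keeps the pairs in their original order, which a plain `STRING_AGG` without `ORDER BY` does not guarantee.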
