Redshift | Flatten JSON and store it in a table

Question

I have a JSON value:

{
    "test": {
        "userId": 77777,
        "sectionScores": [
            {
                "id": 2,
                "score": 244
            },
            {
                "id": 1,
                "score": 212
            }
        ]
    }
}

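Note: the order of the entries in sectionScores can vary.

id 1 maps to sectionname1, id 2 maps to sectionname2, and id 3 maps to sectionname3.

At most 4 sections are returned, but a response may also contain only 2 sections.

I have a Redshift table: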

create table stage.poc1(
user_id bigint ,
json_data super
)

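From this table I want to write a flattening query and insert the result into a target table:

select user_id, json_data.test[0] ... I don't know how to write this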

create table target.poc
(
user_id bigint,
sectionname1_score int,
sectionname2_score int,
sectionname3_score int,
sectionname4_score int
)

insert into target.poc

select user_id, json_data.test[0].score as sectionname1_score

... I don't know how to place the values dynamically

Thanks in advance.

Answer 1

I've made some assumptions, but this should help you get started.

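First, note that when a SUPER value contains attribute names with uppercase letters (such as userId and sectionScores here), you need to enable case-sensitive identifiers in Redshift and wrap any name that contains uppercase letters in double quotes.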
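As a minimal sketch of the case-sensitivity point, assuming the stage.poc1 table from the question already holds the JSON in its json_data SUPER column: with the setting left at its default, identifiers are folded to lowercase, so the path would look for "userid" and is expected to come back NULL rather than 77777.

```sql
-- Session-level setting; it has to be issued in each session that runs the query.
SET enable_case_sensitive_identifier TO true;

-- Double quotes preserve the capital "I" in the attribute name.
-- json_user_id is only an illustrative alias, not part of the original tables.
SELECT json_data.test."userId" AS json_user_id
FROM stage.poc1;
```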

The simplest approach is to convert the JSON string directly to SUPER and then unnest the array into separate rows - see: https://docs.aws.amazon.com/redshift/latest/dg/query-super.html
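The two steps can be sketched on their own before looking at the full query. This assumes the stage.poc1 table from the question and that the JSON arrives as a text literal; the inserted row is sample data taken from the question, compacted onto one line.

```sql
SET enable_case_sensitive_identifier TO true;

-- Step 1: JSON_PARSE converts the JSON text into a SUPER value at load time.
INSERT INTO stage.poc1 (user_id, json_data)
VALUES (
  77777,
  JSON_PARSE('{"test": {"userId": 77777, "sectionScores": [{"id": 2, "score": 244}, {"id": 1, "score": 212}]}}')
);

-- Step 2: listing the array path in the FROM clause unnests it,
-- so each element of sectionScores becomes its own row, aliased as ss.
SELECT p.user_id, ss.id, ss.score
FROM stage.poc1 AS p, p.json_data.test."sectionScores" AS ss;
```

For the sample row this should produce one output row per array element, regardless of the order in which the sections appear in the JSON.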

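Below is an example SQL statement that creates some test data in a CTE (WITH clause), then unnests the array and splits the values into the columns you want. The GROUP BY and SUM() then aggregate the rows into the desired result.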

SET enable_case_sensitive_identifier TO true;

with data as (
select
  json_parse('{
    "test": {        
        "userId": 55555,        
        "sectionScores": [
            {
                "id": 2,
                "score": 544
            },
            {
                "id": 1,
                "score": 512
                
            },
            {
                "id": 3,
                "score": 544
             
            },
            {
                "id": 4,
                "score": 412
                
            }
        ]
    }
}') sp
union all
select
  json_parse('{
    "test": {        
        "userId": 77777,
        "sectionScores": [
            {
                "id": 2,
                "score": 244
             
            },
            {
                "id": 1,
                "score": 212
                
            }
        ]
    }
}') sp
union all
select
  json_parse('{
    "test": {        
        "userId": 66666,
        "sectionScores": [
            {
                "id": 2,
                "score": 644             
            },
            {
                "id": 1,
                "score": 612
                
            },
            {
                "id": 3,
                "score": 612
                
            }
        ]
    }
}') sp
)
select d.sp.test."userId" as user_id, 
  sum(decode(ss.id,1,ss.score)) as sectionname1_score,
  sum(decode(ss.id,2,ss.score)) as sectionname2_score,
  sum(decode(ss.id,3,ss.score)) as sectionname3_score,
  sum(decode(ss.id,4,ss.score)) as sectionname4_score
from data d, d.sp.test."sectionScores" ss
group by user_id
order by user_id;

The exact table definitions you have aren't entirely clear, but from here you should be able to adapt this into what you need.
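One possible adaptation to the tables in the question is sketched below. It assumes stage.poc1 holds one row per user with the JSON already stored in the json_data SUPER column, that target.poc exists with the five columns shown in the question, and that each section id appears at most once per user (in which case MAX and SUM give the same value). The explicit ::int casts are a cautious choice so that the aggregate works on plain integers rather than on SUPER values.

```sql
SET enable_case_sensitive_identifier TO true;

INSERT INTO target.poc
    (user_id, sectionname1_score, sectionname2_score,
     sectionname3_score, sectionname4_score)
SELECT
    p.user_id,
    -- Each CASE keeps the score only for its own section id; other rows yield NULL.
    MAX(CASE WHEN ss.id = 1 THEN ss.score::int END) AS sectionname1_score,
    MAX(CASE WHEN ss.id = 2 THEN ss.score::int END) AS sectionname2_score,
    MAX(CASE WHEN ss.id = 3 THEN ss.score::int END) AS sectionname3_score,
    MAX(CASE WHEN ss.id = 4 THEN ss.score::int END) AS sectionname4_score
FROM stage.poc1 AS p, p.json_data.test."sectionScores" AS ss
GROUP BY p.user_id;
```

Sections missing from a user's JSON are left as NULL in the target row, which covers the case where only 2 of the 4 sections are returned. Users whose sectionScores array is empty produce no row at all with this form of unnesting.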
