I have some deeply nested data from a JSON file that I'm trying to load into DuckDB:
"events": [
  {
    "id": "401638586",
    "uid": "s:40~l:41~e:401638586",
    "date": "2024-03-21T18:45Z",
    "name": "Wagner Seahawks at North Carolina Tar Heels",
    "shortName": "WAG VS UNC",
    "season": {
      "year": 2024,
      "type": 3,
      "slug": "post-season"
    },
    "competitions": [
      {
        "id": "401638586",
        "uid": "s:40~l:41~e:401638586~c:401638586",
        "date": "2024-03-21T18:45Z",
        "attendance": 18223,
        "type": {
          "id": "6",
          "abbreviation": "TRNMNT"
        },
        "timeValid": true,
        "neutralSite": true,
        "conferenceCompetition": false,
        "playByPlayAvailable": true,
        "recent": false,
        .
        .
        .
The query I'm using looks like this:
select
    games['id'] as id,
    games['date'] as date,
    games['season']['year'] as season_year,
    games['season']['slug'] as season_slug,
    '2024-03-21' as partition_date,
    games['name'] as name,
    games['shortName'] as short_name,
    games['status']['period'] as period,
    games['status']['type']['completed'] as completed,
    games['competitions'][0]['neutralSite'] as neutral,
    games['competitions'][0]['conferenceCompetition'] as in_conference,
    games['competitions'][0]['playByPlayAvailable'] as pbp_available
from (
    select
        unnest(events) as games
    from read_json('/path/to/json/data')
)
limit 1;
The output looks like this:
id = 401638586
date = 2024-03-21T18:45Z
season_year = 2024
season_slug = post-season
partition_date = 2024-03-21
name = Wagner Seahawks at North Carolina Tar Heels
short_name = WAG VS UNC
period = 2
completed = true
neutral =
in_conference =
pbp_available =
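As you can see, once the query reaches the nested "competitions" field, it returns only empty/null values. How can I access these fields correctly? I tried using json_extract but couldn't get it to work.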
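You are indexing with [0] instead of [1].
See the warning in the documentation:
Following PostgreSQL's conventions, DuckDB uses 1-based indexing for arrays and lists, but 0-based indexing for the JSON data type.
Here read_json appears to have parsed "competitions" into a native list of structs rather than a JSON value, so the 1-based rule applies and [0] silently returns NULL instead of raising an error. Using [1] produces the expected values: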
games['competitions'][1]['neutralSite'] as neutral,
games['competitions'][1]['conferenceCompetition'] as in_conference,
games['competitions'][1]['playByPlayAvailable'] as pbp_available
Rows: 1
Columns: 10
$ id <str> '401638586'
$ date <str> '2024-03-21T18:45Z'
$ season_year <i64> 2024
$ season_slug <str> 'post-season'
$ partition_date <str> '2024-03-21'
$ name <str> 'Wagner Seahawks at North Carolina Tar Heels'
$ short_name <str> 'WAG VS UNC'
$ neutral <bool> True
$ in_conference <bool> False
$ pbp_available <bool> True
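To see the two indexing conventions side by side, here is a minimal sketch that does not depend on the original file. The literal values are made up for illustration, and it assumes the JSON extension is available (it ships with standard DuckDB builds) and that an out-of-range list index returns NULL, which matches the behaviour seen in the question.

```sql
SELECT
    -- Native LIST: positions start at 1, so [1] is the first element.
    ([10, 20, 30])[1] AS list_at_1,
    -- Position 0 does not exist in a LIST; this is expected to come back as NULL.
    ([10, 20, 30])[0] AS list_at_0,
    -- JSON type: path indexes start at 0, so '$[0]' is the first element.
    json_extract('[10, 20, 30]'::JSON, '$[0]') AS json_at_0;
```

If the column had been kept as a JSON value instead of being parsed into lists and structs, a 0-based path such as '$.competitions[0].neutralSite' passed to json_extract would be the matching form.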