Generalizing the conversion of a .json file to .csv

I have a .json file in the following format:

{
    "uls":{
        "equ1-L1-u": {"D": 1.10, "La": 1.50, "Lb": 1.50},
        "equ1-L2-u": {"D": 1.10, "La": 1.50, "Lb": 1.50}
    },
    "sls":{
        "cha-L1": {"Ld": 1.00, "Le": 1.00, "Lf": 1.00, "Lg": 1.00, "Lh": 1.00},
        "cha-L2": {"D": 1.00, "Df": 1.00}
    }
}

I want to convert it to a CSV "database"-style format:

Criteria, Name,      D,    Df,   La,   Lb,   Ld,   Le,   Lf,   Lg,   Lh
uls,      equ1-L1-u, 1.10,     , 1.50, 1.50,     ,     ,     ,     ,
uls,      equ1-L2-u, 1.10,     , 1.50, 1.50,     ,     ,     ,     ,
sls,      cha-L1,        ,     ,     ,     , 1.00, 1.00, 1.00, 1.00, 1.00
sls,      cha-L2,    1.00, 1.00,     ,     ,     ,     ,     ,     ,

Ideally I would not have to define the value keys in advance, but for now I can do so if necessary.

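This is what I have so far; it works for the specific case of two levels of nesting. I could modify the code to handle 3 or 4 levels, but ideally the same code would work for any depth of nesting (perhaps with recursion?).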

# Function to convert json objects to csv
import json
import csv

def make_csv_dict(data, key_headers):
    csv_dict = []
    for i in data:
        for j in data[i]:
            csv_dict.append({
                key_headers[0]: i,
                key_headers[1]: j,
                **data[i][j]
                })
    return csv_dict

### ENTER DATA HERE ###
key_headers = ["Criteria", "Name"]
path = "File.json"
### ENTER DATA HERE ###

# Read json
with open(path) as json_file:
    data = json.load(json_file)

# make csv_dict from .json data
csv_dict = make_csv_dict(data, key_headers)

# writing to csv file
fieldnames = ["Criteria", "Name", "D", "Df", "La", "Lb", "Lc", "Ld", "Le", "Lf", "Lg", "Lh", "Sl", "Sh", "W", "T", "A", "E"]

with open(path.replace(".json",".csv"), 'w', newline="") as f:
    writer = csv.DictWriter(f, fieldnames)
    writer.writeheader()
    writer.writerows(csv_dict)
python json csv nested
1 Answer

You can use pd.json_normalize and pivot_table() to achieve this, with some additional processing.

First, use json_normalize:

import pandas as pd

data = {
    "uls":{
        "equ1-L1-u": {"D": 1.10, "La": 1.50, "Lb": 1.50},
        "equ1-L2-u": {"D": 1.10, "La": 1.50, "Lb": 1.50},
    },
    "sls":{
        "cha-L1": {"Ld": 1.00, "Le": 1.00, "Lf": 1.00, "Lg": 1.00, "Lh": 1.00},
        "cha-L2": {"D": 1.00, "Df": 1.00},
    }
}

df = pd.json_normalize(data, sep='_')
print(df)

 uls_equ1-L1-u_D  uls_equ1-L1-u_La  ...  sls_cha-L2_D  sls_cha-L2_Df
0              1.1               1.5  ...           1.0            1.0

Next, transpose the DataFrame:

df = df.T.reset_index().rename(columns = {0: 'Values'})
print(df)

            index  Values
0    uls_equ1-L1-u_D     1.1
1   uls_equ1-L1-u_La     1.5
2   uls_equ1-L1-u_Lb     1.5
3    uls_equ1-L2-u_D     1.1
4   uls_equ1-L2-u_La     1.5
5   uls_equ1-L2-u_Lb     1.5
6      sls_cha-L1_Ld     1.0
7      sls_cha-L1_Le     1.0
8      sls_cha-L1_Lf     1.0
9      sls_cha-L1_Lg     1.0
10     sls_cha-L1_Lh     1.0
11      sls_cha-L2_D     1.0
12     sls_cha-L2_Df     1.0

Now we can split the index column into multiple columns, using the separator defined in the json_normalize call:

df[['Criteria', 'Name', 'SubName']] = df['index'].str.split('_', expand=True)
print(df)

                 index  Values Criteria   Name     SubName
0    uls_equ1-L1-u_D     1.1      uls  equ1-L1-u       D
1   uls_equ1-L1-u_La     1.5      uls  equ1-L1-u      La
2   uls_equ1-L1-u_Lb     1.5      uls  equ1-L1-u      Lb
3    uls_equ1-L2-u_D     1.1      uls  equ1-L2-u       D
4   uls_equ1-L2-u_La     1.5      uls  equ1-L2-u      La
5   uls_equ1-L2-u_Lb     1.5      uls  equ1-L2-u      Lb
6      sls_cha-L1_Ld     1.0      sls     cha-L1      Ld
7      sls_cha-L1_Le     1.0      sls     cha-L1      Le
8      sls_cha-L1_Lf     1.0      sls     cha-L1      Lf
9      sls_cha-L1_Lg     1.0      sls     cha-L1      Lg
10     sls_cha-L1_Lh     1.0      sls     cha-L1      Lh
11      sls_cha-L2_D     1.0      sls     cha-L2       D
12     sls_cha-L2_Df     1.0      sls     cha-L2      Df
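
One caveat with this step: the split assumes the separator never appears inside a key. If a name contained an underscore, the split would produce more than three pieces and the assignment to three columns would fail. A minimal sketch of one workaround, assuming a character such as "|" never occurs in the keys (the sample data and the SEP constant are made up for illustration):

```python
import pandas as pd

# Hypothetical data in which the names contain an underscore.
data = {
    "uls": {"equ_1": {"D": 1.1, "La": 1.5}},
    "sls": {"cha_1": {"D": 1.0}},
}

SEP = "|"  # assumption: this character does not occur in any key

# Flatten, transpose, and name the value column, as in the steps above.
df = pd.json_normalize(data, sep=SEP).T.reset_index().rename(columns={0: "Values"})

# A single-character pattern is treated as a literal string, not a regex.
df[["Criteria", "Name", "SubName"]] = df["index"].str.split(SEP, expand=True)
print(df[["Criteria", "Name", "SubName", "Values"]])
```

The same separator has to be passed to both json_normalize and str.split, so keeping it in one variable avoids the two drifting apart.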

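Finally, we need to pivot the DataFrame, reset the index (dropping the leftover axis name), and fill the NaN values with empty strings: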

pivot_df = df.pivot_table(index=['Criteria', 'Name'], columns='SubName', values='Values', aggfunc='first').reset_index().rename_axis(None, axis=1).fillna('')
print(pivot_df)

Criteria       Name    D   Df   La   Lb   Ld   Le   Lf   Lg   Lh
0      sls     cha-L1                      1.0  1.0  1.0  1.0  1.0
1      sls     cha-L2  1.0  1.0
2      uls  equ1-L1-u  1.1       1.5  1.5
3      uls  equ1-L2-u  1.1       1.5  1.5
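
The question asks for a .csv file, so the pivoted frame still has to be written out. A minimal sketch using DataFrame.to_csv, with a shortened copy of the long-format rows and an in-memory buffer standing in for a real output path (both are assumptions for illustration):

```python
import io
import pandas as pd

# A few long-format rows shaped like the split frame above (sample values only).
rows = [
    ("uls", "equ1-L1-u", "D", 1.1),
    ("uls", "equ1-L1-u", "La", 1.5),
    ("sls", "cha-L2", "D", 1.0),
    ("sls", "cha-L2", "Df", 1.0),
]
df = pd.DataFrame(rows, columns=["Criteria", "Name", "SubName", "Values"])

pivot_df = (
    df.pivot_table(index=["Criteria", "Name"], columns="SubName",
                   values="Values", aggfunc="first")
      .reset_index()
      .rename_axis(None, axis=1)
)

# index=False leaves out the 0..n row labels.
# NaN cells are written as empty fields by default, so fillna('') is optional here.
buffer = io.StringIO()  # stand-in for a real path such as "File.csv"
pivot_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Note that pivot_table sorts the rows by the index columns, so the row order in the file differs from the order in the original JSON.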

Putting it all together:

import pandas as pd

data = {
    "uls":{
        "equ1-L1-u": {"D": 1.10, "La": 1.50, "Lb": 1.50},
        "equ1-L2-u": {"D": 1.10, "La": 1.50, "Lb": 1.50},
    },
    "sls":{
        "cha-L1": {"Ld": 1.00, "Le": 1.00, "Lf": 1.00, "Lg": 1.00, "Lh": 1.00},
        "cha-L2": {"D": 1.00, "Df": 1.00},
    }
}

df = pd.json_normalize(data, sep='_')

df = df.T.reset_index().rename(columns = {0: 'Values'})

df[['Criteria', 'Name', 'SubName']] = df['index'].str.split('_', expand=True)

pivot_df = df.pivot_table(index=['Criteria', 'Name'], columns='SubName', values='Values', aggfunc='first').reset_index().rename_axis(None, axis=1).fillna('')
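
The pandas version above is tied to exactly two key levels, because the split assigns to three fixed columns. For the arbitrary-depth case the question mentions, a recursive approach with only the standard library is one option. The following is a sketch, not a tested drop-in: flatten_rows and collect_fieldnames are hypothetical helper names, the nesting depth is assumed to equal the number of entries in key_headers, and an in-memory buffer stands in for the output file.

```python
import csv
import io


def flatten_rows(data, key_headers, prefix=()):
    """Yield one row dict per innermost dict of values.

    The first len(key_headers) nesting levels become key columns;
    the dict found below them supplies the value columns.
    """
    if len(prefix) == len(key_headers):
        yield {**dict(zip(key_headers, prefix)), **data}
        return
    for key, child in data.items():
        yield from flatten_rows(child, key_headers, prefix + (key,))


def collect_fieldnames(rows, key_headers):
    """Key headers first, then every value key seen, sorted alphabetically."""
    value_keys = sorted({k for row in rows for k in row} - set(key_headers))
    return list(key_headers) + value_keys


data = {
    "uls": {
        "equ1-L1-u": {"D": 1.10, "La": 1.50, "Lb": 1.50},
        "equ1-L2-u": {"D": 1.10, "La": 1.50, "Lb": 1.50},
    },
    "sls": {
        "cha-L1": {"Ld": 1.00, "Le": 1.00, "Lf": 1.00, "Lg": 1.00, "Lh": 1.00},
        "cha-L2": {"D": 1.00, "Df": 1.00},
    },
}
key_headers = ["Criteria", "Name"]

rows = list(flatten_rows(data, key_headers))
fieldnames = collect_fieldnames(rows, key_headers)

buffer = io.StringIO()  # stand-in for open("File.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)  # keys missing from a row are written as empty fields
csv_text = buffer.getvalue()
print(csv_text)
```

With three levels of nesting, passing three names in key_headers is enough; the functions themselves do not change. The value keys are discovered from the data, so fieldnames does not need to be listed by hand.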
