Convert an embedded JSON string to a pandas DataFrame

Question (votes: -1, answers: 1)

I have the following data:

json_str = """[{“key1”: “value1”, “key2”= “value2”,
“key3”: “{“key_a”: “value_a1”, “key_b”: “value_b1”, “key_c”: “value_c1”}”,“key4”: 4},
{“key1”: “value5”, “key2”= “value6”,
“key3”: “{“key_a”: “value_a2”, “key_b”: “value_b2”, “key_c”: “value_c2”}”,“key4”: 8}]"""

I want to convert it to a pandas DataFrame. I tried this:

#code1
data = pd.read_json(json_str)
print(data)

#code2
data = pd.read_json(json_str, typ='series')
print(data)

#code3
data = pd.DataFrame.from_dict([json_str], orient='columns', dtype=None)
print(data)

# all three attempts give the same output
ValueError: Unexpected character found when decoding object value

I also tried:

data = json.loads(json_str)
print(data)
error : json.decoder.JSONDecodeError: Expecting ',' delimiter

I can't use .replace(), because I need a column named "key3" that holds the JSON value, for example: {“key_a”: “value_a1”, “key_b”: “value_b1”, “key_c”: “value_c1”}

python json pandas
1 Answer
0 votes

The data needs to be cleaned first. Here is one way to do it:

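from functools import reduce
import ast

import pandas as pd

# Map each malformed token to its Python-literal equivalent.
di = {'“':"'", '”':"'", "'{":'{', "}'":"}", "=":':' }

# Apply the replacements in order to json_str from the question.
new = reduce(lambda x, y: x.replace(y, di[y]), di, json_str)

# json_normalize flattens the nested key3 dict into key3.* columns.
df = pd.json_normalize(ast.literal_eval(new))

print(df)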

     key1    key2 key3.key_a key3.key_b key3.key_c  key4
0  value1  value2   value_a1   value_b1   value_c1     4
1  value5  value6   value_a2   value_b2   value_c2     8
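
The approach above flattens key3 into separate key3.* columns, whereas the question asks for a single "key3" column that holds the JSON value. A minimal sketch of one way to get that is below. It assumes the curly quotes and the "=" separator are the only defects in the input, and that "=" only needs fixing where it sits between a quoted key and a quoted value. The helper name to_valid_json is hypothetical, and json_str is restated with triple quotes so the snippet runs on its own.

```python
import json
import re

import pandas as pd

json_str = """[{“key1”: “value1”, “key2”= “value2”,
“key3”: “{“key_a”: “value_a1”, “key_b”: “value_b1”, “key_c”: “value_c1”}”,“key4”: 4},
{“key1”: “value5”, “key2”= “value6”,
“key3”: “{“key_a”: “value_a2”, “key_b”: “value_b2”, “key_c”: “value_c2”}”,“key4”: 8}]"""


def to_valid_json(raw):
    """Hypothetical helper: turn the malformed text into parseable JSON."""
    # Curly quotes -> straight double quotes.
    text = raw.replace("“", '"').replace("”", '"')
    # Drop the quotes wrapped around the inner object.
    text = text.replace('"{', "{").replace('}"', "}")
    # Fix "key"= "value" separators (assumed to occur only between quoted tokens).
    return re.sub(r'"\s*=\s*"', '": "', text)


records = json.loads(to_valid_json(json_str))

# Re-serialise the nested object so "key3" stays one column of JSON text.
for record in records:
    record["key3"] = json.dumps(record["key3"])

df = pd.DataFrame(records)
print(df)
```

If the flattened layout is wanted instead, the parsed records can be passed to pd.json_normalize before the loop that re-serialises key3.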