Given a single-line string containing multiple arbitrarily nested JSON documents with no separator between them, for example:
contents = r'{"payload":{"device":{"serial":213}}}{"payload":{"device":{"serial":123}}}'
How can I parse contents into a list of dicts (an array of JSON objects)? I tried:
df = pd.read_json(contents, lines=True)
but all I got was a ValueError:
ValueError: Unexpected character found when decoding array value (2)
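This has already been answered here: https://stackoverflow.com/a/54666028/693869
Here is an example of a working generator. I added some comment strings to the input that would break the accepted answer.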
import json
from typing import Iterator

contents = r'{"payload":{"device":{"serial":213}},"comment":"spoiler:|hello|"}{"payload":{"device":{"serial":123}},"comment":"Hey look at my strange face: }{"}'

def parse_payloads(s: str) -> Iterator[dict]:
    decoder = json.JSONDecoder()
    end = 0
    while end < len(s):
        # raw_decode parses one JSON value starting at index `end` and
        # returns it together with the index just past that value.
        item, end = decoder.raw_decode(s, end)
        print(item)
        yield item

json_dicts = list(parse_payloads(contents))
print(json_dicts)
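One caveat with the generator above: as far as I know, raw_decode does not skip leading whitespace, so input in which the documents are separated by spaces or newlines would raise a JSONDecodeError. Below is a minimal sketch of a whitespace-tolerant variant. The name iter_json_values and the sample string are made up for illustration and are not part of the original answer.

```python
import json
from typing import Any, Iterator


def iter_json_values(s: str) -> Iterator[Any]:
    """Yield each JSON value from a string of concatenated documents.

    Hypothetical helper: whitespace between documents is skipped;
    anything else that is not valid JSON still raises JSONDecodeError.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(s):
        # Advance past whitespace, which raw_decode would reject.
        if s[pos].isspace():
            pos += 1
            continue
        item, pos = decoder.raw_decode(s, pos)
        yield item


# Made-up sample: two documents separated by a space and a newline.
sample = '{"serial": 213} \n{"comment": "}{ inside a string"}'
values = list(iter_json_values(sample))
print(values)
```

Because it is a generator, it can also be consumed lazily in a for loop instead of being collected into a list.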
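You can split the string and then parse each JSON string into a dict. This assumes that neither }{ nor the | delimiter ever appears inside a string value: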
import json
contents = r'{"payload":{"device":{"serial":213}}}{"payload":{"device":{"serial":123}}}'
json_strings = contents.replace('}{', '}|{').split('|')
json_dicts = [json.loads(string) for string in json_strings]
Output:
[{'payload': {'device': {'serial': 213}}}, {'payload': {'device': {'serial': 123}}}]
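To see where the split approach stops working, here is a small sketch with an input whose string values contain | and }{ (the sample data is invented for illustration). The split version fails with a JSONDecodeError, while json.JSONDecoder.raw_decode, which respects string boundaries, still recovers both documents.

```python
import json

# Invented sample: the string values contain both "|" and "}{".
tricky = r'{"id":1,"comment":"a|b"}{"id":2,"comment":"odd: }{"}'

# Approach 1: split on an inserted delimiter.
pieces = tricky.replace('}{', '}|{').split('|')
try:
    split_result = [json.loads(p) for p in pieces]
except json.JSONDecodeError:
    split_result = None  # the split cut through a string value

# Approach 2: decode one document at a time with raw_decode.
decoder = json.JSONDecoder()
decoded, pos = [], 0
while pos < len(tricky):
    item, pos = decoder.raw_decode(tricky, pos)
    decoded.append(item)

print(split_result)
print(decoded)
```

If you control the input and know that string values never contain those character sequences, the split approach is still the shortest option.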