How to avoid losing escaped double quotes when saving to JSON Lines?

Question

I have a file in jsonl format that looks like this:

input.jsonl

{"file_name": "input_image.jpg", "text": "II. Firtos. Jelige: \"Vándor daruid V betűje szállt.\""}

I am reading this file with pandas:

import pandas as pd

def load_jsonl1():
    # Read the JSON Lines file; each line becomes one DataFrame row
    return pd.read_json(
        path_or_buf='input.jsonl',
        lines=True,
    )

df1 = load_jsonl1()
df1.head(1)

After displaying the first row, we see that the backslashes (\) before the double quotes are missing:

II. Firtos. Jelige: "Vándor daruid V betűje szállt."
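For context, here is a minimal sketch of why the backslashes disappear on read. In JSON, \" is escape syntax for a double quote inside a string, not part of the value itself, so a JSON parser returns the text without the backslash. The standard-library json module is used here as a stand-in for pd.read_json, on the assumption that both decode escapes the same way:

```python
import json

# The raw line as stored in input.jsonl; the raw string keeps the backslashes literal.
raw_line = r'{"file_name": "input_image.jpg", "text": "II. Firtos. Jelige: \"Vándor daruid V betűje szállt.\""}'

# Parsing decodes the \" escapes into plain double-quote characters.
record = json.loads(raw_line)
print(record["text"])
```

So the parsed value is not missing anything. The escapes only need to be reapplied when the text is written back out as JSON.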

I wrote the following:

import os
from PIL import Image
from tqdm import tqdm

# path, aug_imgs, sample_amount and augment_img are defined elsewhere
new_df = pd.DataFrame(columns=['file_name', 'text'])
output_path = 'output.jsonl'
# create output.jsonl file
with open(output_path, 'w') as outfile:
  # wrap the range in tqdm to see progress
  for idx in tqdm(range(len(df1))):
    image_path = os.path.join(path, df1['file_name'][idx])
    image = Image.open(image_path).convert("RGB")
    for i in range(sample_amount):
      augmented_image = augment_img(image)
      augmented_image_path = os.path.join(aug_imgs, 'aug_' + str(idx * sample_amount + i) + '.jpg')
      augmented_image.save(augmented_image_path)
      # write updated label to output.jsonl file
      outfile.write('{"file_name": "' + os.path.basename(augmented_image_path) + '", "text": "' + df1['text'][idx] + '"}\n')

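I want to keep the text exactly as it appears in input.jsonl, with all of its symbols, for later use. So after reading the file with pandas and saving it to output.jsonl, I expect the same line as in the original file: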

{"file_name": "input_image.jpg", "text": "II. Firtos. Jelige: \"Vándor daruid V betűje szállt.\""}

How can I avoid losing the escaped double quotes when saving to JSON Lines?
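One possible approach, sketched under assumptions rather than offered as a definitive fix: instead of building the JSON line by string concatenation, let json.dumps serialize each record so the quotes are escaped again on write. The helper name make_jsonl_line is hypothetical, and ensure_ascii=False is used so that accented characters such as á and ű are written as-is rather than as \uXXXX escapes:

```python
import json

def make_jsonl_line(file_name, text):
    """Hypothetical helper: serialize one record as a single JSON Lines row."""
    # json.dumps escapes embedded double quotes as \" automatically.
    return json.dumps({"file_name": file_name, "text": text}, ensure_ascii=False) + "\n"

# The text as it looks after parsing: plain double quotes, no backslashes.
text = 'II. Firtos. Jelige: "Vándor daruid V betűje szállt."'
line = make_jsonl_line("input_image.jpg", text)
print(line, end="")
```

In the loop above, the outfile.write(...) call could then become outfile.write(make_jsonl_line(os.path.basename(augmented_image_path), df1['text'][idx])). With ensure_ascii=False it is probably also worth opening the output file with encoding='utf-8'.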
