I have a CSV file in which every row is a string formatted as JSON. Some elements are nested under a given key. Here is a simplified example row:
{"key1":"string_value","key2":["string_value"],"key4":integer,"key5":[{"nested_key1":"string_value","nested_key2":boolean}],"key6":integer}
I need to write each row to its own JSON file, named row_#.json.
The closest I have come is this code:
import csv
csv_file = 'path/to/my/file.csv'
with open(csv_file, mode='r') as infile:
    for i, line in enumerate(infile):
        with open(f"row_{i}.json", "w") as outfile:
            outfile.write(line)
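The output files show every key and every string value wrapped in an additional set of double quotes. The entire file content is also wrapped in quotes. Using the example above, I end up with: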
"{""key1"":""string_value"",""key2"":[""string_value""],""key4"":integer,""key5"":[{""nested_key1"":""string_value"",""nested_key2"":boolean}],""key6"":integer}"
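How can I output the original row content without this extra formatting? Can I do it without reading the output JSON files back in and trying to replace strings? Note that I cannot go back to the source and have it formatted as a proper CSV before reading the file into Python.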
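The extra quotes are not added by Python. Iterating over the file and writing each line back out copies the text unchanged, so the quoting must already be in the source file. Each JSON string is stored there as a single quoted CSV field, and CSV escapes a double quote inside a quoted field by doubling it. To recover the original JSON, let csv.reader parse each row: it removes the outer quotes and turns every "" back into ". No string replacement on the output files is needed.
import csv

# Path to your CSV file
csv_file = 'path/to/my/file.csv'

# newline='' lets the csv module handle line endings itself
with open(csv_file, mode='r', newline='') as infile:
    # csv.reader strips the outer quotes and un-doubles the inner ones;
    # enumerate keeps track of row numbers
    for i, row in enumerate(csv.reader(infile)):
        if not row:
            continue  # skip blank lines
        # Assumes each row has exactly one field: the JSON string
        with open(f"row_{i}.json", "w") as outfile:
            outfile.write(row[0])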
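Here is a minimal, self-contained sketch of decoding such rows with Python's csv module, which treats a doubled quote inside a quoted field as one literal quote. The helper name split_csv_json_rows, the sample row, and the json.loads validation step are assumptions added for illustration and are not part of the original question.

```python
import csv
import json
import tempfile
from pathlib import Path


def split_csv_json_rows(csv_path, out_dir):
    """Hypothetical helper: write the first field of each CSV row to out_dir/row_<i>.json."""
    out_dir = Path(out_dir)
    written = []
    with open(csv_path, mode="r", newline="", encoding="utf-8") as infile:
        # csv.reader removes the outer quotes and collapses "" into "
        for i, row in enumerate(csv.reader(infile)):
            if not row:
                continue  # skip blank lines
            json.loads(row[0])  # optional check: raises ValueError on invalid JSON
            out_path = out_dir / f"row_{i}.json"
            out_path.write_text(row[0], encoding="utf-8")
            written.append(out_path)
    return written


# Demo on a made-up row that uses the same CSV-style quoting as the question
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_text(
        '"{""key1"":""a"",""key4"":1,""key5"":[{""nested_key2"":true}]}"\n',
        encoding="utf-8",
    )
    paths = split_csv_json_rows(sample, tmp)
    demo_text = paths[0].read_text(encoding="utf-8")
    print(demo_text)
```

The json.loads call is only a sanity check. Drop it if some rows are intentionally not valid JSON, as with the integer and boolean placeholders in the simplified example.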