Trying to embed JSON data as multi-line text in a key within a list of JSON data, and save the list as a well-indented JSON file

Problem description (votes: 0, answers: 2)

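I have a list of JSON data named mylist that contains data for a website. I want to attach the site's network log as the value of a key in that list. Because the network log is large, I want to save it as multi-line text rather than as one long single-line string. The output in the Stack Overflow container shows the value of the "networkLogFilePath" key as multi-line text, but that is not the case in the JSON file. I want to achieve the same effect in the JSON file.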

My code:

    import json
    import re

    # Sample network log: two ad entries with identical text.
    dummyjson = [
        {
            "rank": 1,
            "bottom_ad_title": "Onlinecarparts.co.uk Coupon | New 50% Coupon Code",
            "bottom_ad_desc": (
                "Redeem 50% Onlinecarparts.co.uk Coupon Now. "
                "Latest Verified Coupons & Deals! "
                "50% Onlinecarparts.co.uk Coupon Limited Time Offer. "
                "Onlinecarparts.co.uk Coupons Save Now. Top Discounts. Save Today."
            ),
            "bottom_ad_link": "https://www.couponscored.com",
        },
        {
            "rank": 2,
            "bottom_ad_title": "Onlinecarparts.co.uk Coupon | New 50% Coupon Code",
            "bottom_ad_desc": (
                "Redeem 50% Onlinecarparts.co.uk Coupon Now. "
                "Latest Verified Coupons & Deals! "
                "50% Onlinecarparts.co.uk Coupon Limited Time Offer. "
                "Onlinecarparts.co.uk Coupons Save Now. Top Discounts. Save Today."
            ),
            "bottom_ad_link": "https://www.couponscored.com",
        },
    ]

    # better_json = re.sub(r'^((\s*)".*?":)\s*([\[{])', r'\1\n\2\3', json.dumps(dummyjson, indent='\t'), flags=re.MULTILINE)
    # print(better_json)  # commented out as well: better_json is not defined above

    # The log is serialized to a string before being stored under the key.
    mylist = [
        {
            "result": {
                "linkText": "https://www.skechers.com/shoe-finder/",
                "link": "https://www.skechers.com",
                "price": "",
                "affiliate": "",
                "byAffiliate": "",
                "hostName": "www.skechers.com",
                "IsCompetitorSite": 0,
            },
            "networkLogFilePath": json.dumps(dummyjson),
        }
    ]

    # "[" + ",\n,".join([json.dumps(d) for d in dummyjson]) + "]"}] # didn't work

    # json.dumps(dummyjson, indent=1, separators=(',', ': ')) # didn't work

    # pprint.pprint(mylist) # log printed and indented, but as pieces of cropped strings

    print(json.dumps(mylist, indent=2))

    # Open the output file only when there is something to write.
    with open("dummydict1.json", "w") as f:
        f.write(json.dumps(mylist, indent=2))

Output:

    [
      {
        "result": {
          "linkText": "https://www.skechers.com/shoe-finder/",
          "link": "https://www.skechers.com",
          "price": "",
          "affiliate": "",
          "byAffiliate": "",
          "hostName": "www.skechers.com",
          "IsCompetitorSite": 0
        },
        "networkLogFilePath": "[{\"rank\": 1, \"bottom_ad_title\": \"Onlinecarparts.co.uk Coupon | New 50% Coupon Code\", \"bottom_ad_desc\": \"Redeem 50% Onlinecarparts.co.uk Coupon Now. Latest Verified Coupons & Deals! 50% Onlinecarparts.co.uk Coupon Limited Time Offer. Onlinecarparts.co.uk Coupons Save Now. Top Discounts. Save Today.\", \"bottom_ad_link\": \"https://www.couponscored.com\"}, {\"rank\": 2, \"bottom_ad_title\": \"Onlinecarparts.co.uk Coupon | New 50% Coupon Code\", \"bottom_ad_desc\": \"Redeem 50% Onlinecarparts.co.uk Coupon Now. Latest Verified Coupons & Deals! 50% Onlinecarparts.co.uk Coupon Limited Time Offer. Onlinecarparts.co.uk Coupons Save Now. Top Discounts. Save Today.\", \"bottom_ad_link\": \"https://www.couponscored.com\"}]"
      }
    ]

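The output is indented JSON, but the value of the "networkLogFilePath" key is one long single-line string (megabytes in size). The container here displays it as multi-line text, but the .json file does not. I tried indent=2 and custom separators in json.dumps, re.MULTILINE, pprint.pprint, and joining the values in networkLogFilePath with ",\n,", but none of them worked.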

Can anyone help me get the value of the networkLogFilePath key written as a multi-line string, as shown in the Stack Overflow container?
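For comparison, here is a minimal sketch of why the value ends up on one line, and of one possible alternative. It assumes the log does not have to be stored as a string; `network_log`, `as_string` and `as_object` are hypothetical names used only for illustration, and the data is a shortened stand-in for the real log.

```python
import json

# Hypothetical stand-in for the real network log (assumption: a list of dicts).
network_log = [
    {"rank": 1, "bottom_ad_link": "https://www.couponscored.com"},
    {"rank": 2, "bottom_ad_link": "https://www.couponscored.com"},
]

# Variant A (as in the question): the log is serialized first, so the outer
# json.dumps only sees a str and writes it as one escaped line.
as_string = [{"networkLogFilePath": json.dumps(network_log)}]

# Variant B: the log is nested as a list, so indent applies to it as well.
as_object = [{"networkLogFilePath": network_log}]

text_a = json.dumps(as_string, indent=2)
text_b = json.dumps(as_object, indent=2)

# Compare how many physical lines each variant produces.
print(len(text_a.splitlines()))
print(len(text_b.splitlines()))
```

With variant B, a consumer reads the log as a nested list rather than as a string that needs a second `json.loads`, so it only fits if whatever reads the file can accept that change.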

json python-3.x selenium-webdriver web-scraping indentation
2 answers
0 votes

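VS Code has a "Word Wrap" option that wraps long JSON data onto the next line. The file I had received, however, already had that setting enabled specifically for it: its long strings are displayed wrapped, and the wrapping can easily be toggled with Alt+Z. The JSON file newly created by the code above does not have the setting, and Alt+Z does not work on it.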
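A small sketch of why the multi-line appearance has to come from the editor rather than from the file: a JSON string value cannot contain a raw line break, so `json.dumps` escapes it. The names `two_lines`, `encoded` and `raw` are illustrative only.

```python
import json

# A Python string that really contains two lines.
two_lines = "first line\nsecond line"

# json.dumps escapes the newline, so the JSON text stays on one physical line.
encoded = json.dumps({"log": two_lines})
print(encoded)

# A raw line break inside a JSON string is rejected by the default (strict) parser.
raw = '{"log": "first line\nsecond line"}'
try:
    json.loads(raw)
    raw_is_valid = True
except json.JSONDecodeError:
    raw_is_valid = False
print(raw_is_valid)
```

Any wrapping seen for such a value is therefore a display feature of the viewer, which matches the word-wrap behaviour described above.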


0 votes

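Well, it turns out there is a "Forcefully Enable Features" option: features such as word wrap had been disabled because the file was too large to view. Problem solved.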
