How to convert to JSON in Python using json.dumps

Question  Votes: 0  Answers: 2

How do I convert this to JSON format?

I get the error "is not JSON serializable".

Here is my program:

from urllib2 import urlopen as uReq
import re
from bs4 import BeautifulSoup, Comment
import requests
import json
my_url='https://uae.dubizzle.com/en/property-for-rent/residential/apartmentflat/?filters=(neighborhoods.ids=123)&page=1'

uClient=uReq(my_url)
page_html= uClient.read()
page_soup=BeautifulSoup(page_html, 'html.parser')
comments = page_soup.findAll(text=lambda text:isinstance(text, Comment))
[comment.extract() for comment in comments]
json_output= page_soup.find_all("script",type="application/ld+json",string=re.compile("SingleFamilyResidence")) #find_all("script", "application/ld+json")
#comments = json_output.findAll(text=lambda text:isinstance(text, Comment))
#[comment.extract() for comment in comments]
#json_output.find_all(text="<script type=""application/ld+json"">").replaceWith("")
#print json_output
jsonD = json.dumps(json_output)
uClient.close()

[{"@context":"http://schema.org","@type":"SingleFamilyResidence","name":"Spacious 2BHK for rent on Damascus Street, Al Qusais","url":"https://dubai.dubizzle.com/property-for-rent/residential/apartmentflat/2018/4/29/spacious-two-bed-room-available-for-rent-i-2/","address":{"@type":"PostalAddress","addressLocality":"Dubai","addressRegion":"Dubai"},"":{"@type":"Product","name":"Spacious 2BHK for rent on Damascus Street, Al Qusais","url":"https://dubai.dubizzle.com/property-for-rent/residential/apartmentflat/2018/4/29/spacious-two-bed-room-available-for-rent-i-2/","offer":{"@type":"Offer","price":49000,"priceCurrency":"AED"}},"floorSize":1400,"numberOfRooms":2,"image":"https://dbzlpvfeeds-a.akamaihd.net/images/user_images/2018/04/29/80881784_CP_photo.jpeg","geo":{"@type":"GeoCoordinates","latitude":55.3923,"longitude":25.2893}},{"@context":"http://schema.org","@type":"SingleFamilyResidence","name":"Fully furnished 2-bedroom - Al Qusais","url":"https://dubai.dubizzle.com/property-for-rent/residential/apartmentflat/2017/10/9/fully-furnished-brand-new-2-bed-room-flat--2/","address":{"@type":"PostalAddress","addressLocality":"Dubai","addressRegion":"Dubai"},"":{"@type":"Product","name":"Fully furnished 2-bedroom - Al Qusais","url":"https://dubai.dubizzle.com/property-for-rent/residential/apartmentflat/2017/10/9/fully-furnished-brand-new-2-bed-room-flat--2/","offer":{"@type":"Offer","price":70000,"priceCurrency":"AED"}},"floorSize":1400,"numberOfRooms":2,"image":"https://dbzlpvfeeds-a.akamaihd.net/images/user_images/2018/09/05/84371522_CP_photo.jpeg","geo":{"@type":"GeoCoordinates","latitude":55.3959,"longitude":25.2959}}]

json python-2.7 web-scraping beautifulsoup
2 Answers
0
votes

Hi, I added another BeautifulSoup wrapper and got the expected JSON.

First get the text with the .get_text() method, then pass it to json.loads.

Thanks, everyone.

from urllib2 import urlopen as uReq
import re
from bs4 import BeautifulSoup, Comment
import requests
import json
my_url='https://uae.dubizzle.com/en/property-for-rent/residential/apartmentflat/?filters=(neighborhoods.ids=123)&page=1'

uClient=uReq(my_url)
page_html= uClient.read()
page_soup=BeautifulSoup(page_html, 'lxml')# 'html.parser')
json_output= BeautifulSoup(str(page_soup.find_all("script",type="application/ld+json",string=re.compile("SingleFamilyResidence"))), 'lxml')#find_all("script", "application/ld+json")
json_text=json_output.get_text()
json_data = json.loads(json_text)
print json_data
uClient.close()
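
A variation on the approach above, offered only as a minimal sketch: instead of re-parsing the stringified result set, the text of each matching script tag can be passed to json.loads on its own. The helper name parse_ld_json and the sample strings are hypothetical and not part of the original answer; the sketch assumes each script body holds either one JSON object or a list of objects.

```python
import json

def parse_ld_json(script_texts):
    """Parse each JSON-LD script body and return one flat list of records."""
    records = []
    for text in script_texts:
        data = json.loads(text)
        # A script body may hold a single object or a list of objects.
        if isinstance(data, list):
            records.extend(data)
        else:
            records.append(data)
    return records

# Hypothetical stand-ins for the text of each matching <script> tag.
# With BeautifulSoup they could be collected along the lines of:
#   [tag.string for tag in page_soup.find_all("script", type="application/ld+json")]
sample_texts = [
    '{"@type": "SingleFamilyResidence", "numberOfRooms": 2, "floorSize": 1400}',
    '[{"@type": "SingleFamilyResidence", "numberOfRooms": 2}]',
]

records = parse_ld_json(sample_texts)
print(json.dumps(records, indent=2))
```

Because records is a plain list of dicts, json.dumps accepts it without the "not JSON serializable" error, and individual fields such as records[0]["floorSize"] stay accessible.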

0
votes

Convert the bs4.element.ResultSet to a string first, then convert it to JSON:

json_data = json.dumps(str(json_output))
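
It may help to see what this call actually produces. In the sketch below, html is a hypothetical stand-in for str(json_output), not real page data. The error goes away because a string is serializable, but the result encodes a single string containing the HTML rather than the listing data itself.

```python
import json

# Hypothetical stand-in for str(json_output): a ResultSet rendered as text.
html = '[<script type="application/ld+json">{"price": 49000}</script>]'

encoded = json.dumps(html)      # valid JSON, but it encodes one string
decoded = json.loads(encoded)   # decoding returns that same string

print(type(decoded).__name__)   # a string type, not a list or dict
print(decoded == html)
```

To reach fields such as the price, the script text itself would still need to be parsed with json.loads.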