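I am trying to retrieve only the links to the companies listed on the following page: https://clutch.co/it-services/msp
This seems to be a common problem, and I have spent a whole day going through other posts without any success.
Code: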
links = []
for l in soup.find_all(class_='website-link website-link-a'):
    results = (l.get('href'))
    links.append(results)
print(links)
Output:
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
Printing just the result of soup.find_all gives:
<a data-extlink-pid="1219089" href="https://fulcrumdigital.com/" rel="nofollow" target="_blank">
<i class="icon icon-visit-site"></i><span class="">Visit Website</span>
</a>
</li>, etc, etc,
I need to extract the value that follows href, but I cannot figure out how. Any suggestions would be much appreciated.
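One likely explanation for the None values, judging from the find_all output above: the matched elements are the <li> tags that carry the class, while the href sits on the <a> nested inside each of them, so l.get('href') asks the <li> for an attribute it does not have. A minimal sketch with a trimmed-down copy of that markup (the surrounding <ul> and the opening <li> tag are assumptions; the live page may differ):

```python
from bs4 import BeautifulSoup

# Stand-in for one listing entry. The <ul> wrapper and the opening <li>
# are assumed; only the <a> ... </li> part appears in the question.
html = """
<ul>
  <li class="website-link website-link-a">
    <a data-extlink-pid="1219089" href="https://fulcrumdigital.com/" rel="nofollow" target="_blank">
      <i class="icon icon-visit-site"></i><span class="">Visit Website</span>
    </a>
  </li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
items = soup.find_all(class_="website-link website-link-a")

# The match is the <li>, which has no href attribute of its own.
hrefs_from_li = [item.get("href") for item in items]

# Stepping down to the nested <a> reaches the attribute.
hrefs_from_a = [item.a.get("href") for item in items if item.a is not None]

print(items[0].name)
print(hrefs_from_li)
print(hrefs_from_a)
```

Here item.a is Beautiful Soup's shorthand for the first <a> descendant of the tag; the `is not None` check skips entries that have no link at all.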
You can use the CSS selector '.website-link-a > a' (it selects every <a> tag that is a direct child of a tag with class="website-link-a"):
import requests
from bs4 import BeautifulSoup

url = 'https://clutch.co/it-services/msp'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# every <a> that is a direct child of an element with class "website-link-a"
for a in soup.select('.website-link-a > a'):
    print(a['href'])
Prints:
http://electric.ai/
http://www.symphony-solutions.com/?utm_source=clutch.co&utm_medium=referral&utm_campaign=it-services-msp
https://www.bairesdev.com/?utm_source=clutch.co&utm_medium=referral&utm_campaign=msp
https://www.helixstorm.com/?utm_source=clutch.co&utm_medium=referral&utm_campaign=it-services-msp
http://www.sundevs.com/?utm_source=clutch.co&utm_medium=referral&utm_campaign=it-services-msp
http://www.computersolutionseast.com/?utm_source=clutch.co&utm_medium=referral&utm_campaign=it-services-msp
/your-project
http://techmd.com
http://www.sugarshot.io/?utm_source=clutch.co&utm_medium=referral&utm_campaign=directory
https://www.empist.com?utm_source=clutch.co&utm_medium=referral
http://www.frameworkIT.com/?utm_source=clutch.co&utm_medium=referral
https://www.clickittech.com/
https://cyberduo.com/?utm_source=clutch.co&utm_medium=referral&utm_campaign=it-services-msp
http://www.realnets.com/?utm_source=clutch.co&utm_medium=referral
https://www.ibexlabs.com/?utm_source=clutch.co&utm_medium=referral
https://bianor.com/
http://www.endpoint.com/?utm_source=clutch.co&utm_medium=referral
https://devopsprodigy.com/?utm_source=clutch.co&utm_medium=referral&utm_campaign=directory
https://vrpconsulting.com/
https://siliconreef.co.uk/?utm_source=clutch.co&utm_medium=referral
http://www.agencypartner.com?utm_source=clutch&utm_medium=profile&utm_campaign=directory_listing
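
The list above still contains one site-relative entry (/your-project), which is not a company website, and most URLs carry utm_* tracking parameters. If only clean company links are wanted, a small post-processing step along these lines might help. This is a sketch using only the standard library; the helper names is_external and strip_tracking are hypothetical, and the sample list is a handful of values copied from the output above:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def is_external(href):
    """True for absolute http(s) URLs, False for relative paths like /your-project."""
    parts = urlsplit(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def strip_tracking(href):
    """Drop utm_* query parameters and keep everything else unchanged."""
    parts = urlsplit(href)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# A few hrefs taken from the printed output above.
sample = [
    "http://electric.ai/",
    "/your-project",
    "https://www.empist.com?utm_source=clutch.co&utm_medium=referral",
    "https://bianor.com/",
]

cleaned = [strip_tracking(href) for href in sample if is_external(href)]
print(cleaned)
# ['http://electric.ai/', 'https://www.empist.com', 'https://bianor.com/']
```

Stripping the tracking parameters is optional; the filter on scheme and host alone is enough to drop the relative entry.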