How do I scrape all the text from each accordion section with Selenium?


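I'm using Selenium to scrape some prices from a website, but my code doesn't seem to iterate over each accordion block. All I want is the title, items, and prices for each treatment.

How can I do this?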

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from selenium.common.exceptions import NoSuchElementException

service_obj = Service(r"C:\chromedriver.exe")
options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)
driver = webdriver.Chrome(service=service_obj, options=options)

url = 'https://www.23dental.com/plans-fees/dental-fees'

driver.implicitly_wait(2)
driver.get(url)  # load the page before looking up any elements
#Reject Cookies
RejectCookiesButton = driver.find_element(By.ID,"onetrust-reject-all-handler")
RejectCookiesButton.click()

#Expand all sections
ExpandAccordions = driver.find_elements(By.CLASS_NAME,'material-icons')
for ExpandAccordion in ExpandAccordions:
    ExpandAccordion.click()


DentalFees = []

practices = driver.find_elements(By.CLASS_NAME, 'col-12')
for practice in practices:
    prices = driver.find_element(By.CLASS_NAME,'accordion-content').text
    DentalFees.append(prices)
driver.close()

print(DentalFees)
python selenium-webdriver web-scraping selenium-chromedriver
1 Answer

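I'll suggest a slightly different approach. I looked at the site, and you don't need Selenium to get this data, because the data you want is sent in the initial HTML. A plain HTTP request plus an HTML parser is enough.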
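One quick way to check this for any page is to search the raw response text for a value you can see in the browser. Below is a minimal sketch. The helper name `appears_in_static_html` and the sample markup are made up for illustration; in practice `page_source` would be the `.text` of a `requests.get(...)` response.

```python
def appears_in_static_html(page_source: str, needle: str) -> bool:
    """Return True if `needle` occurs in the raw HTML, ignoring case."""
    return needle.casefold() in page_source.casefold()


# Hypothetical stand-in for the text of an HTTP response body.
sample_source = "<table><tr><td>Small x-ray</td><td>£29.00</td></tr></table>"

print(appears_in_static_html(sample_source, "small x-ray"))
print(appears_in_static_html(sample_source, "Implant placement"))
```

If the value is found, the page probably does not need a browser to render it. If it is missing, the content is likely added by JavaScript, and a tool such as Selenium may still be needed.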

Here is some code that gets what you're looking for:

import requests
from lxml import html

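resp = requests.get('https://www.23dental.com/plans-fees/dental-fees', timeout=30)
resp.raise_for_status()  # stop early on an HTTP error response

tree = html.fromstring(resp.text)

# Each table row inside a "pricing" block holds a product and its price
for item in tree.xpath('//div[contains(@class, "pricing")]//tr'):
    names = item.xpath('./td[1]/text()')
    amounts = item.xpath('./td[2]/text()')
    if not names or not amounts:
        continue  # skip rows without both cells, such as header rows
    product, price = names[0], amounts[0]

    print(f'{product = } {price = }')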

Output:

product = 'New patient examination' price = '£75.00'
product = 'Exisitng patient examination' price = '£59.00'
product = 'Small x-ray' price = '£29.00'
product = 'Large x-ray' price = '£69.00'
product = 'Hygiene appointment 30 mins' price = '£85.00'
product = 'Extraction (simple)' price = '£119.00'
product = 'Extraction (complex)' price = '£199.00'
product = 'Simple fillings (silver)' price = 'from £109.00'
product = 'Simple fillings (white)' price = 'from £135.00'
product = 'Root canal treatment' price = 'from £269.00'
product = 'Crown' price = 'from £725.00'
product = 'Bridge (per unit)' price = 'from £679.00'
product = 'Full upper or lower denture' price = 'from £765.00'
product = 'Consultation' price = 'Free'
product = 'Implant placement' price = 'from £2,195.00'
product = 'Dental veneers' price = '£695.00'
product = 'Teeth whitening' price = '£400.00'
product = 'Band 1' price = '£23.80'
product = 'Band 2' price = '£65.20'
product = 'Band 3' price = '£282.80'
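The price strings above are not uniform: some carry a "from" prefix, some use thousands separators, and one is simply "Free". If you need numbers rather than text, a small parser along these lines may help. This is only a sketch: `parse_price` is a hypothetical helper, and treating "Free" as zero is an assumption you may not want.

```python
import re
from decimal import Decimal
from typing import Optional

# A pound sign followed by digits, optional commas, and optional decimals.
_AMOUNT = re.compile(r"£\s*([\d,]+(?:\.\d+)?)")


def parse_price(text: str) -> Optional[Decimal]:
    """Return the pound amount in `text`, Decimal(0) for 'Free', else None."""
    if text.strip().casefold() == "free":
        return Decimal("0")  # assumption: a free item costs nothing
    match = _AMOUNT.search(text)
    if match is None:
        return None
    return Decimal(match.group(1).replace(",", ""))


for raw in ["£75.00", "from £2,195.00", "Free"]:
    print(raw, "->", parse_price(raw))
```

`Decimal` is used instead of `float` so that currency amounts keep their exact two decimal places. Note that the "from" qualifier is lost, so store it separately if it matters.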

You may need to install requests and lxml:

pip install requests 
pip install lxml
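The code above prints products and prices but not the title of each accordion section, which the question also asked for. One way to get it is to loop over the section containers first and read the heading and the rows inside each one. The sketch below runs against made-up markup: the `accordion-item` class, the `h3` heading, and the `fees_by_section` helper are all assumptions, so inspect the real page and adjust the XPath expressions to match.

```python
from lxml import html

# Hypothetical markup: the real page's tags and class names may differ.
SAMPLE = """
<div class="accordion">
  <div class="accordion-item">
    <h3>General dentistry</h3>
    <div class="accordion-content pricing">
      <table>
        <tr><td>Small x-ray</td><td>£29.00</td></tr>
        <tr><td>Large x-ray</td><td>£69.00</td></tr>
      </table>
    </div>
  </div>
  <div class="accordion-item">
    <h3>Implants</h3>
    <div class="accordion-content pricing">
      <table>
        <tr><td>Consultation</td><td>Free</td></tr>
      </table>
    </div>
  </div>
</div>
"""


def fees_by_section(page_source):
    """Map each section title to a list of (product, price) tuples."""
    tree = html.fromstring(page_source.strip())
    sections = {}
    for block in tree.xpath('//div[contains(@class, "accordion-item")]'):
        # string(...) yields the text of the first matching heading.
        title = block.xpath('string(.//h3)').strip()
        rows = []
        for row in block.xpath('.//tr'):
            cells = [td.text_content().strip() for td in row.xpath('./td')]
            if len(cells) >= 2:  # skip rows without two data cells
                rows.append((cells[0], cells[1]))
        sections[title] = rows
    return sections


for title, rows in fees_by_section(SAMPLE).items():
    print(title)
    for product, price in rows:
        print(f"  {product}: {price}")
```

The XPath expressions inside the loop start with `.` so that each lookup is scoped to the current section rather than the whole document. The Selenium code in the question skips this scoping by calling `driver.find_element` inside its loop, which returns the first match on the page every time.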