Click a button with Selenium and Python until it no longer exists


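I am trying to scrape this website: https://www.vertexconnects.com/find-atc

After entering any zip code, I can't get the while loop to keep clicking the "Load More" button. The code seems to fail on the locations line, the one that collects every location result, with this error: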

   raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: 

Here is the code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select

options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options = options)
action = ActionChains(driver)

driver.get("https://www.vertexconnects.com/find-atc")
driver.maximize_window()
wait = WebDriverWait(driver,5)
# Use below line only if you are getting the Accept/Reject cookies pop-up
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[contains(.,'Accept All')]"))).click()

location_textbox = wait.until(EC.presence_of_element_located((By.ID,"location-search-input")))
action.move_to_element(location_textbox).click().send_keys("10001").perform()
wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "atc-finder-button"))).click()

while True:
    try:
        wait.until(EC_element_to_be_clickable((By.ID, "loadMore"))).click()
    except:
        break

print("done")

locations = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='location-result']")))

for location in locations:
    name = location.find_element(By.TAG_NAME, "h4").text()
    address = location.find_element(By.CLASS_NAME, "address atc-finder-hospital-address").text()
    phone_num = location.find_element(By.TAG_Name, "a").text
    print(name, address, phone_num)
python selenium-webdriver web-scraping
1 Answer

Problems in the code:

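  1. Your while loop never actually clicks the Load More button, for the following reasons:

    1a. The line below contains a typo: it should read EC. rather than EC_. Strictly speaking this is not a syntax error; Python raises a NameError when the line runs, the bare except clause swallows it, and the loop exits on its first pass without clicking anything.

    wait.until(EC_element_to_be_clickable((By.ID, "loadMore"))).click()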
    

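    1b. Selenium is unable to find the "Load More" button through EC.element_to_be_clickable. Change it to EC.presence_of_element_located, which only requires the element to be present in the DOM rather than visible and enabled.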

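  2. In the lines below, each call should end in .text rather than .text(). text is a property, not a method, so calling it raises TypeError ('str' object is not callable).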

    name = location.find_element(By.TAG_NAME, "h4").text()
    address = location.find_element(By.CLASS_NAME, "address atc-finder-hospital-address").text()
    
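  3. The following locator strategy is incorrect. By.CLASS_NAME cannot be used when the value contains more than one class. For reference, address is one class and atc-finder-hospital-address is another.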

    By.CLASS_NAME, "address atc-finder-hospital-address"
    
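  4. If you look at the page, not every hospital has a phone number. The line below therefore fails whenever a result has no <a> tag, because find_element raises NoSuchElementException. It also misspells By.TAG_NAME as By.TAG_Name.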

    phone_num = location.find_element(By.TAG_Name, "a").text
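
To see why point 1a produced no visible error, here is a minimal sketch that needs neither Selenium nor a browser. The names click_load_more and run_loop are hypothetical, and the built-in TimeoutError merely stands in for Selenium's TimeoutException; the only thing taken from the question is the EC_ typo.

```python
def click_load_more():
    # Hypothetical stand-in for the question's wait.until(...).click() line.
    # EC_element_to_be_clickable is an undefined name, so this raises NameError.
    return EC_element_to_be_clickable


def run_loop(catch):
    """Mimic the question's while loop, catching only the given exception types."""
    clicks = 0
    while True:
        try:
            click_load_more()
            clicks += 1
        except catch:
            break
    return clicks


# A bare "except:" behaves like "except BaseException": the typo is swallowed
# and the loop ends immediately with zero clicks.
silent_clicks = run_loop(BaseException)

# Catching only a timeout-style error lets the NameError propagate instead.
try:
    run_loop(TimeoutError)
    surfaced = None
except NameError as exc:
    surfaced = exc

print(silent_clicks, type(surfaced).__name__)
```

Narrowing the except clause is what turns a silent early exit into a traceback that points at the misspelled name.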
    

Here is the refactored code:

import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options = options)

driver.get("https://www.vertexconnects.com/find-atc")
driver.maximize_window()
wait = WebDriverWait(driver,10)
# Use below line only if you are getting the Accept/Reject cookies pop-up
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[contains(.,'Accept All')]"))).click()

wait.until(EC.element_to_be_clickable((By.ID,"location-search-input"))).send_keys("10001")
wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "atc-finder-button"))).click()

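# Keep clicking "Load More" until the button can no longer be found or clicked
while True:
    try:
        wait.until(EC.presence_of_element_located((By.ID, "loadMore"))).click()
        time.sleep(3)  # give the next batch of results time to load
    except Exception:  # unlike a bare except, this still lets Ctrl+C interrupt the script
        break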

print("done")

locations = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='location-result']")))

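for location in locations:
    name = location.find_element(By.TAG_NAME, "h4").text
    # The leading "." keeps the XPath relative to this result; a bare "//" searches the whole page
    address = location.find_element(By.XPATH, ".//div[@class='address atc-finder-hospital-address']").text
    # find_elements returns an empty list (no exception) when a result has no phone link
    phone_num = location.find_elements(By.TAG_NAME, "a")
    if len(phone_num)>0:
        print(name, address, phone_num[0].text)
    else:
        print(name, address)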
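
The loop in this script stops only when a click fails, so it would keep running if the button never went away. Below is a sketch of the same idea with an upper bound. click_until_gone, max_clicks and FakeButton are names introduced here for illustration, not part of Selenium; the helper accepts any zero-argument callable so it can be tried without a browser.

```python
from typing import Callable


def click_until_gone(click_once: Callable[[], None], max_clicks: int = 100) -> int:
    """Call click_once() until it raises or max_clicks is reached.

    Returns the number of clicks that succeeded.
    """
    clicks = 0
    while clicks < max_clicks:
        try:
            click_once()
        except Exception:
            # Any failure is treated as "the button is gone".
            break
        clicks += 1
    return clicks


class FakeButton:
    """Test double: clickable `remaining` times, then behaves as if removed."""

    def __init__(self, remaining: int) -> None:
        self.remaining = remaining

    def click(self) -> None:
        if self.remaining == 0:
            raise LookupError("button no longer present")
        self.remaining -= 1


three_clicks = click_until_gone(FakeButton(3).click)           # button disappears first
capped = click_until_gone(FakeButton(10).click, max_clicks=4)  # cap is hit first
print(three_clicks, capped)
```

With a real driver, the callable could be something like lambda: wait.until(EC.presence_of_element_located((By.ID, "loadMore"))).click(), assuming the wait object from the script.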

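Console output, as posted with the answer. Every row repeats the same address because that run used an XPath starting with //, which searches the whole document even when called on a single result element; a relative .// path returns each result's own address: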

done
Cohen Children's Medical Center 269-01 76th Avenue, New Hyde Park, NY 11040, US
Children’s Hospital of Philadelphia 269-01 76th Avenue, New Hyde Park, NY 11040, US (267) 601-3461
Dana-Farber Brigham Cancer Center 269-01 76th Avenue, New Hyde Park, NY 11040, US (877) 442-3324
Massachusetts General Hospital 269-01 76th Avenue, New Hyde Park, NY 11040, US (617) 643-9042
Boston Medical Center 269-01 76th Avenue, New Hyde Park, NY 11040, US (617) 638-8130
Children's National Hospital 269-01 76th Avenue, New Hyde Park, NY 11040, US (202) 476-5367
CLEVELAND CLINIC 269-01 76th Avenue, New Hyde Park, NY 11040, US (216) 444-5517
Nationwide Children's Hospital 269-01 76th Avenue, New Hyde Park, NY 11040, US (614) 722-6425 Option 6
Ohio State University Wexner Medical Center 269-01 76th Avenue, New Hyde Park, NY 11040, US (614) 293-3153
Cincinnati Children's Hospital Medical Center 269-01 76th Avenue, New Hyde Park, NY 11040, US (513) 517-2234
University of Chicago 269-01 76th Avenue, New Hyde Park, NY 11040, US (773) 702-6808
Northwestern Memorial Hospital 269-01 76th Avenue, New Hyde Park, NY 11040, US (312) 695-0990
The Children’s Hospital At Tristar Centennial 269-01 76th Avenue, New Hyde Park, NY 11040, US
Children's Hospital of New Orleans 269-01 76th Avenue, New Hyde Park, NY 11040, US
Medical City Dallas Hospital 269-01 76th Avenue, New Hyde Park, NY 11040, US
Methodist Hospital 269-01 76th Avenue, New Hyde Park, NY 11040, US
City of Hope National Medical Center 269-01 76th Avenue, New Hyde Park, NY 11040, US (800) 826-4673 [email protected]
Children's Hospital of Orange County (CHOC) 269-01 76th Avenue, New Hyde Park, NY 11040, US

Process finished with exit code 0
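
The repeated address in this output comes down to where a search starts. The sketch below illustrates the difference with Python's standard xml.etree.ElementTree standing in for the browser DOM; it is an analogy, not Selenium, and the markup, hospital names and phone number are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented, simplified markup; the real page's structure may differ.
PAGE = """
<body>
  <div class="location-result">
    <h4>Hospital A</h4>
    <div class="address">1 First St</div>
  </div>
  <div class="location-result">
    <h4>Hospital B</h4>
    <div class="address">2 Second St</div>
    <a>555-0100</a>
  </div>
</body>
"""

root = ET.fromstring(PAGE)
locations = root.findall(".//div[@class='location-result']")

# Searching from the document root on every iteration always returns the
# first address, which is what "//div[...]" does in Selenium.
from_root = [root.find(".//div[@class='address']").text for _ in locations]

# Searching from each result returns that result's own address,
# which is what ".//div[...]" does in Selenium.
per_result = [loc.find(".//div[@class='address']").text for loc in locations]

# A missing phone link yields a default value instead of an exception.
phones = [loc.findtext("a", default="") for loc in locations]

print(from_root)
print(per_result)
print(phones)
```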