How do I scrape phone numbers after clicking a button on multiple pages?


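I previously asked how to click a button on a page. It worked the first time, but I have since realized that it only works some of the time. I have multiple pages: for some of them I get the numbers, and for others I get nothing. Is there a way to get all the data I need? The project is my final exam for a beginner's Python course.

The button that needs to be clicked is in the upper right of the page and shows the text "Prikaži broj". Here is my attempt, but it does not work the way I want: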

condos = [
'https://www.nekretnine.rs/stambeni-objekti/stanovi/vracar-lokacija-juzni-bulevar-adresa-vojvode-hrvoja-beograd/1958955/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/vozdovac-autokomanda-trise-kaclerovica-90m2-trise-kaclerovica/NkvU3_gZyb6/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/vracar-prote-mateje-78m2-id1187/NkwQVDgJqsw/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/palilula-botanicka-basta-bulevar-despota-stefana-60m2-bulevar-despota-stefana/1734451/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/palilula-postanska-stedionica-dalmatinska-94m2-dalmatinska/Nk1bTYWifZj/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/stari-grad-kalemegdan-strahinjica-bana-37m2-strahinjica-bana/NklcRCutVNB/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/palilula-borca-moravske-divizije-73m2-moravske-divizije/207667/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/palilula-visnjicka-banja-slobodana-jovanovica-75m2-slobodana-jovanovica/Nk2nu-zdbzW/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/zvezdara-mirijevo-jovanke-radakovic-61m2-jovanke-radakovic/NkW5Qg22seE/',
'https://www.nekretnine.rs/stambeni-objekti/stanovi/zvezdara-deram-pijaca-duke-dinic-80m2-duke-dinic/Nk26as4b71N/']

condo_agency_home_phones = []
condo_agency_cell_phones = []

options = Options()
options.headless = False
driver = webdriver.Chrome('/Users/Nenad/chromedriver', options=options)
for condo in condos:
    driver.get(condo)
    try:
        element = driver.find_element_by_css_selector('body > div:nth-child(14) > div.row.pt-4 > div.col-lg-4.mb-5 > div.border-box.pt-3.pl-3.pr-3.pb-0.d-none.d-lg-block > div > div.row > div.col-12.col-sm-6.contact-footer > div > div > form:nth-child(2) > button').click()
        sleep(randint(3, 5))
        element2 = driver.find_element_by_css_selector('body > div:nth-child(14) > div.row.pt-4 > div.col-lg-4.mb-5 > div.border-box.pt-3.pl-3.pr-3.pb-0.d-none.d-lg-block > div > div.row > div.col-12.col-sm-6.contact-footer > div > div > form:nth-child(4) > button').click()
        sleep(randint(3, 5))
        home_phone = driver.find_element_by_css_selector('body > div:nth-child(14) > div.row.pt-4 > div.col-lg-4.mb-5 > div.border-box.pt-3.pl-3.pr-3.pb-0.d-none.d-lg-block > div > div.row > div.col-12.col-sm-6.contact-footer > div > div > form:nth-child(2) > span')
        cell_phone = driver.find_element_by_css_selector('body > div:nth-child(14) > div.row.pt-4 > div.col-lg-4.mb-5 > div.border-box.pt-3.pl-3.pr-3.pb-0.d-none.d-lg-block > div > div.row > div.col-12.col-sm-6.contact-footer > div > div > form:nth-child(4) > span')
        condo_agency_home_phones.append(home_phone.text)
        condo_agency_cell_phones.append(cell_phone.text)
    except:
        condo_agency_home_phones.append('NaN')
        condo_agency_cell_phones.append('NaN')

The solution I was given was:

element = driver.find_element_by_css_selector('button[type="button"]').click()

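This clicks the button only some of the time, and I still don't know how to extract the numbers after the click. If anyone knows how to do it, please let me know.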

python selenium web-scraping
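4 Answers
0
votes

Use WebDriverWait to handle the dynamic elements. You still need to give the page a moment, though: the full phone number only appears after the button has been clicked, so add a time.sleep(1) after each click.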

condo_agency_home_phones = []
condo_agency_cell_phones = []
wait = WebDriverWait(driver, 10)  # one wait object can be reused for every lookup

for condo in condos:
    driver.get(condo)
    try:
        # Click the first "Prikaži broj" button and read the number it reveals.
        element = wait.until(expected_conditions.element_to_be_clickable((By.XPATH, "//button[contains(text(),'broj')]")))
        element.click()
        time.sleep(1)
        home_phone = wait.until(expected_conditions.element_to_be_clickable((By.XPATH, "(//span[@class='cell-number'])[1]"))).text

        # Click the next remaining button and read the second number.
        element2 = wait.until(expected_conditions.element_to_be_clickable((By.XPATH, "//button[contains(text(),'broj')]")))
        element2.click()
        time.sleep(1)
        cell_phone = wait.until(expected_conditions.element_to_be_clickable((By.XPATH, "(//span[@class='cell-number'])[2]"))).text

        # Append only after both lookups succeed so the two lists stay the same length.
        condo_agency_home_phones.append(home_phone)
        condo_agency_cell_phones.append(cell_phone)
    except Exception:
        condo_agency_home_phones.append('NaN')
        condo_agency_cell_phones.append('NaN')

print(condo_agency_home_phones, condo_agency_cell_phones)

Note that you need the following imports.

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium import webdriver
import time
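As a possible alternative to the fixed time.sleep(1), WebDriverWait.until accepts any callable that takes the driver, so the wait can poll until the span text actually looks like a full number. The sketch below is only an illustration: the class name number_revealed and the min_digits threshold are hypothetical, and it assumes that a still-hidden number contains fewer digits than a revealed one, which has not been checked against the site.

```python
import re


class number_revealed:
    """Wait condition: return the element text once it looks like a full number.

    Hypothetical helper; the min_digits threshold is an assumption about how
    the page hides the number before the button is clicked.
    """

    def __init__(self, locator, min_digits=6):
        self.locator = locator        # (by, value) pair, e.g. (By.XPATH, "...")
        self.min_digits = min_digits

    def __call__(self, driver):
        text = driver.find_element(*self.locator).text.strip()
        digits = re.sub(r"\D", "", text)  # keep digits only
        # Returning False makes WebDriverWait keep polling until the timeout.
        return text if len(digits) >= self.min_digits else False
```

With the imports above it could be used as WebDriverWait(driver, 10).until(number_revealed((By.XPATH, "(//span[@class='cell-number'])[1]"))), which returns the text itself rather than the element.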

0
votes

Welcome to SO. Here are your options.

Option 1: Use expected conditions (this ensures the element has been found before you click it)

wait = WebDriverWait(driver, 10)
ele = wait.until(EC.presence_of_element_located((By.XPATH, "//button[.='Prikaži broj']")))
ele.click()  # the parentheses are required; a bare ele.click does nothing

Option 2: Use JavaScript. (This is like dispatching a click event directly on the button.)

ele = driver.find_element_by_xpath("//button[.='Prikaži broj']")
driver.execute_script("arguments[0].click();", ele)
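The two options can also be combined, trying the normal click first and falling back to JavaScript only when it fails. This is a rough sketch rather than a tested fix for this site; click_with_fallback is a hypothetical helper name, and catching every Exception is a simplification made so the example stays self-contained.

```python
def click_with_fallback(driver, element):
    """Try a normal click; if it raises, click through JavaScript instead.

    Returns "native" or "javascript" so the caller can log which path ran.
    """
    try:
        element.click()
        return "native"
    except Exception:
        # Assumption: any failure here is a click problem, such as an overlay
        # covering the button. Real code could catch only
        # selenium.common.exceptions.WebDriverException instead.
        driver.execute_script("arguments[0].click();", element)
        return "javascript"
```

Here element would be the button returned by the wait in option 1.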

0
votes

You could also try waiting for the button to become clickable:

WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "form button[type=button]"))).click()

Additional imports:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

0
votes

The following code loads the Adblock extension, and with it I get all the numbers most of the time:

path_to_extension = r'C:\Users\Nenad\Desktop\3.42.0_0'
options = Options()
options.add_argument('load-extension=' + path_to_extension)
options.headless = False
driver = webdriver.Chrome('/Users/Nenad/chromedriver', options=options)
driver.create_options()

The extension path was copied from:

C:\Users\Nenad\AppData\Local\Google\Chrome\User Data\Default\Extensions\gighmmpiobklfepjocnamgkkbiglidom

I think this is a working solution.
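Since this still works only most of the time, one option is to retry a page a few times before recording 'NaN'. The following is a minimal sketch under stated assumptions: scrape_with_retries and its scrape_one argument are hypothetical names, and scrape_one is assumed to be a function that opens one URL, returns a (home_phone, cell_phone) tuple, and raises an exception when a number cannot be read.

```python
import time


def scrape_with_retries(urls, scrape_one, attempts=3, delay=0.0):
    """Call scrape_one(url) up to `attempts` times per URL.

    Returns a dict mapping each URL to the scraped tuple, or to
    ('NaN', 'NaN') if every attempt raised an exception.
    """
    results = {}
    for url in urls:
        for _ in range(attempts):
            try:
                results[url] = scrape_one(url)
                break  # success, move on to the next URL
            except Exception:
                time.sleep(delay)  # brief pause before the next attempt
        else:
            # The loop finished without a break: all attempts failed.
            results[url] = ('NaN', 'NaN')
    return results
```

Keying the results by URL keeps each pair of numbers attached to its page, so a failed page cannot shift the two lists out of step.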
