没有点击所有标签而没有循环一次问题

问题描述 投票:6回答:2

我想点击网页上的标签,如下所示。不幸的是,它似乎只是点击一些选项卡,尽管在检查Chrome时正确的xpath是正确的。我只能假设它没有单击所有选项卡,因为没有使用完整的xpath。 enter image description here

但是..我试过改变xpath:

//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"]致:

//div[@class='KambiBC-event-groups-list']//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"] FOR:

clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH,'(//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"])[%s]' % str(index + 1))))    

但问题仍然存在。我也尝试过使用CSS:

#KambiBC-contentWrapper__bottom > div > div > div > div > div.KambiBC-quick-browse-container.KambiBC-quick-browse-container--list-only-mode > div.KambiBC-quick-browse__list.KambiBC-delay-scroll--disabled > div > div.KambiBC-time-ordered-list-container > div.KambiBC-time-ordered-list-content > div > div > div.KambiBC-collapsible-container.KambiBC-mod-event-group-container > header

然而,这一直给我错误...对于:

clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,'("#KambiBC-contentWrapper__bottom > div > div > div > div > div.KambiBC-quick-browse-container.KambiBC-quick-browse-container--list-only-mode > div.KambiBC-quick-browse__list.KambiBC-delay-scroll > div > div.KambiBC-time-ordered-list-container > div.KambiBC-time-ordered-list-content > div > div > div > header")[%s]' % str(index + 1))))

应该注意的是,我想单击所有未打开的选项卡,我似乎无法使用CSS选择器来查找足够的特定元素,因为我认为它不允许在这种情况下缩小类元素。

有没有办法解决这个不点击一切的问题?

应该指出的是,我正在使用......

索引中的索引:

indexes = [index for index in range(len(options))]
shuffle(indexes)
for index in indexes:

是否有更优雅的方式用于1循环?

[import sys
sys.exit()][1]

完整的code

python css selenium xpath selenium-webdriver
2个回答
4
投票

这循环遍历每个联赛1比1的所有比赛,根据需要收集所有相关数据。您可以通过在每个查询前加上.并通过match.find_element_by_xpath('.//your-query-here')选择匹配来收集每个匹配中的更多数据。让我知道这是否成功!

import sys, io, os, csv, requests, time
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from selenium import webdriver

driver = webdriver.Chrome()
driver.set_window_size(1024, 600)
driver.maximize_window()

try:
    os.remove('vtg121.csv')
except OSError:
    pass

driver.get('https://www.unibet.com.au/betting#filter/football')
time.sleep(1)

clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, 
    ('//div[@class="KambiBC-collapsible-container '\
    'KambiBC-mod-event-group-container"]'))))
time.sleep(0)

xp_opened = '//div[contains(@class, "KambiBC-expanded")]'
xp_unopened = '//div[@class="KambiBC-collapsible-container ' \
    'KambiBC-mod-event-group-container" ' \
    'and not(contains(@class, "KambiBC-expanded"))]'
opened = driver.find_elements_by_xpath(xp_opened)
unopened = driver.find_elements_by_xpath(xp_unopened)

data = []
for league in opened:
    xp_matches = './/li[contains(@class,"KambiBC-event-item")]'
    matches = league.find_elements_by_xpath(xp_matches)

    try:
        # League Name
        xp_ln = './/span[@class="KambiBC-mod-event-group-header__main-title"]'
        ln = league.find_element_by_xpath(xp_ln).text.strip()
    except:
        ln = None
    print(ln)

    for match in matches:
        # get all the data per 'match group'
        xp_team1_name = './/button[@class="KambiBC-mod-outcome"][1]//' \
            'span[@class="KambiBC-mod-outcome__label"]'
        xp_team1_odds = './/button[@class="KambiBC-mod-outcome"][1]//' \
            'span[@class="KambiBC-mod-outcome__odds"]'
        xp_team2_name = './/button[@class="KambiBC-mod-outcome"][3]//' \
            'span[@class="KambiBC-mod-outcome__label"]'
        xp_team2_odds = './/button[@class="KambiBC-mod-outcome"][3]//' \
            'span[@class="KambiBC-mod-outcome__odds"]'

        try:
            team1_name = match.find_element_by_xpath(xp_team1_name).text
        except:
            team1_name = None

        try:
            team1_odds = match.find_element_by_xpath(xp_team1_odds).text
        except:
            team1_odds = None

        try:
            team2_name = match.find_element_by_xpath(xp_team2_name).text
        except:
            team2_name = None

        try:
            team2_odds = match.find_element_by_xpath(xp_team2_odds).text
        except:
            team2_odds = None

        data.append([ln, team1_name, team1_odds, team2_name, team2_odds])

for league in unopened:
    league.click()
    time.sleep(0.5)
    matches = league.find_elements_by_xpath(xp_matches)

    try:
        ln = league.find_element_by_xpath(xp_ln).text.strip()
    except:
        ln = None
    print(ln)

    for match in matches:
        try:
            team1_name = match.find_element_by_xpath(xp_team1_name).text
        except:
            team1_name = None

        try:
            team1_odds = match.find_element_by_xpath(xp_team1_odds).text
        except:
            team1_odds = None

        try:
            team2_name = match.find_element_by_xpath(xp_team2_name).text
        except:
            team2_name = None

        try:
            team2_odds = match.find_element_by_xpath(xp_team2_odds).text
        except:
            team2_odds = None

        data.append([ln, team1_name, team1_odds, team2_name, team2_odds])

with open('vtg121.csv', 'a', newline='', encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    for row in data:
        writer.writerow(row)
        print(row)

1
投票

OP's code without extra imports

发生错误是因为site对标签OP的XPath不是连续的。它有差距。例如,现在我找不到

// * [@ ID = “KambiBC-contentWrapper__bottom”] / DIV / DIV / DIV / DIV / DIV [3] / div1 / DIV / DIV [3] / DIV [2] / DIV / DIV / DIV [2] /头

不久之前,在比赛开始之前,我找不到

// * [@ ID = “KambiBC-contentWrapper__bottom”] / DIV / DIV / DIV / DIV / DIV [3] / div1 / DIV / DIV [3] / DIV [2] / DIV / DIV / DIV [1] /头

当我谈到index时,我指的是上面的大胆部分。

当游戏上线时,标签会突然将索引从2变为1.(粗体部分会发生变化。)在这两种情况下,都存在间隙:无法找到1或无法找到2。

我想,有这个差距的原因是因为中间还有另一个不可点击的元素。见下图。 enter image description here

league是造成差距的原因。因此,只要代码命中league占据的索引,它就会超时。因为League按钮和其他标签切换League和实时游戏的位置,所以当位置变化时,指数会被交换。 (我认为这就是为什么我找不到Xpath,大胆的部分首先是1,后来才发现是2)。

以下是OP代码的一部分。最后你可以看到str(index + 1)。

indexes = [index for index in range(len(options))] # 
shuffle(indexes) # the OP use shuffle from random. Still 0 and 1 is contained.
path = '(//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"])'
for index in indexes:
    # Because there are some indexes are missing because of League button,
    # nothing can be found at the index and it times out.
    clickMe = wait(driver, 10).until(
EC.element_to_be_clickable((By.XPATH, path + '[%s]' % str(index + 1))))

尝试捕获超时异常以跳过League占用的索引。您还可以保留一个计数器,以便只允许在一个页面上捕获一个超时异常。如果有第二次超时,你知道除了League按钮之外还有其他问题,应该停止。

from selenium import webdriver 
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time
driver = webdriver.Firefox()
driver.set_window_size(1024, 600)
driver.maximize_window()
wait = WebDriverWait 

driver.get('https://www.unibet.com.au/betting#filter/football')
time.sleep(5)

options = driver.find_elements_by_xpath("""//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div""")

print("Total tabs that we want to open is {}".format(len(options)))
indexes = [index for index in range(len(options))]

for index in indexes:
    print(index)
    try:
        clickMe = wait(driver, 5).until(EC.presence_of_element_located((By.XPATH,
            """//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div[{}]/header""".format(str(index+1)))))
        clickMe.click()
    except TimeoutException as ex:
        print("catch you! {}".format(index))
        pass
© www.soinside.com 2019 - 2024. All rights reserved.