How do I scroll to the bottom of a page with Selenium?


I am trying to scroll down to the end of the page with this code:

from selenium import webdriver

url = 'http://www.tradingview.com/screener'
driver = webdriver.Firefox()
driver.get(url)

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

# will give a list of all tickers
tickers = driver.find_elements_by_css_selector('a.tv-screener__symbol') 

for index in range(len(tickers)):
    print("Row " + tickers[index].text + " ") 

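However, the while loop never ends. Even after the bottom of the page has been reached, Selenium keeps trying to scroll down, so the program never moves on. How can I detect that the bottom of the page has been reached so that the rest of the code can run?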

python selenium-webdriver vertical-scrolling
1 Answer

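Beneath the tickers, the page tells you how many rows (matches) are in the table. So one option is to compare the number of visible rows with the total number of rows, and exit the loop once the visible row count reaches that total.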

from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'http://www.tradingview.com/screener'
driver = webdriver.Firefox()
driver.get(url)

# Read the total number of matches reported by the page
selector = '.js-field-total.tv-screener-table__field-value--total'
matches = driver.find_element(By.CSS_SELECTOR, selector)
matches = int(matches.text.split()[0])

visible_rows = 0
scrolls = 0

while visible_rows < matches:

    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait 10 scrolls before updating row information
    if scrolls == 10:
        table = driver.find_elements(By.CLASS_NAME, 'tv-data-table__tbody')
        visible_rows = len(table[1].find_elements(By.TAG_NAME, 'tr'))
        scrolls = 0

    scrolls += 1

# will give a list of all tickers
tickers = driver.find_elements(By.CSS_SELECTOR, 'a.tv-screener__symbol')

for ticker in tickers:
    print("Row " + ticker.text + " ")
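
A more general way to detect the bottom, which does not depend on the match counter, is to compare document.body.scrollHeight before and after each scroll and stop once it no longer grows. The following is a minimal sketch rather than part of the original answer: the helper name scroll_to_bottom and its pause and max_rounds parameters are assumptions chosen for illustration, and the pause will likely need tuning to how quickly the page loads new rows.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=200):
    """Scroll until document.body.scrollHeight stops growing.

    `driver` is any object with Selenium's execute_script method.
    `pause` and `max_rounds` are illustrative defaults, not values
    taken from the page. Returns the number of scrolls performed.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)

    for scrolls in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded rows time to appear

        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            return scrolls  # height unchanged: assume the bottom was reached
        last_height = new_height

    return max_rounds  # safety cap hit before the height settled
```

With a real driver, calling scroll_to_bottom(driver) after driver.get(url) would take the place of the while True loop from the question. On a slow connection the height can stay unchanged for one pause even though more rows are still coming, so a longer pause is the safer choice.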

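Edit: Since your setup doesn't seem to allow the row-counting solution, here is a different approach you can try. The page loads 150 rows at a time. So instead of counting the visible rows, we can take the expected total number of matches/rows (e.g. 4894) and divide it by 150 to get the number of scrolls needed. If we scroll at least that many times, all rows should, in theory, be visible and the code can continue.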

from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

url = 'http://www.tradingview.com/screener'
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service('./chromedriver'))
driver.get(url)

try:

    selector = '.js-field-total.tv-screener-table__field-value--total'
    condition = EC.visibility_of_element_located((By.CSS_SELECTOR, selector))
    matches = WebDriverWait(driver, 10).until(condition)
    matches = int(matches.text.split()[0])

except (TimeoutException, Exception):
    print ('Problem finding matches, setting default...')
    matches = 4895 # Set default

# The page loads 150 rows at a time; divide matches by
# 150 to determine the number of times we need to scroll;
# add 5 extra scrolls just to be sure
num_loops = int(matches / 150 + 5)

for _ in range(num_loops):

    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    sleep(2) # Pause briefly to allow loading time

# will give a list of all tickers
tickers = driver.find_elements(By.CSS_SELECTOR, 'a.tv-screener__symbol')

n_tickers = len(tickers)

msg = 'Correct ' if n_tickers == matches else 'Incorrect '
msg += 'number of tickers ({}) found'
print(msg.format(n_tickers))

for index in range(n_tickers):
    print("Row " + tickers[index].text + " ")
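
The scroll count in the code above can be checked by hand: with 4894 matches and 150 rows per load, 4894 / 150 is about 32.6, so 33 loads are enough to cover every row, and the extra scrolls are a safety margin. A small sketch of that calculation follows. The function name scrolls_needed and its parameters are hypothetical, and it rounds up with math.ceil instead of truncating with int() as the answer does, so for 4894 matches it gives 38 where the answer's expression gives 37.

```python
import math


def scrolls_needed(matches, page_size=150, extra=5):
    """Number of scrolls to reveal `matches` rows, plus a safety margin."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(matches / page_size) + extra


print(scrolls_needed(4894))           # 33 loads + 5 extra = 38
print(scrolls_needed(4894, extra=0))  # 33
print(scrolls_needed(300, extra=0))   # 2
```

Rounding up before adding the margin keeps the margin a true surplus, which matters only when the margin is small.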