Selenium Python: scrolling a dynamic table

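Generally, you need to execute JavaScript on the client to scroll an element. For example:

    driver.execute_script("window.scrollTo(0, 500);")

However, your case is more complicated because this is a virtualized table: rows are loaded and removed as we scroll. So we need to locate the data table, scroll it step by step, and collect the elements along the way. We can't simply scroll to the bottom and then collect everything, because the earlier rows will already have been removed by then.

As we scroll bit by bit, the sets of elements loaded at each scroll step necessarily overlap, so we need to take care to remove duplicates. I built a simple example that works for your page: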

    import time

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome(service=Service())
    driver.get("https://data.nordpoolgroup.com/auction/day-ahead/prices?deliveryDate=2025-01-15&currency=EUR&aggregation=DailyAggregate&deliveryAreas=AT,FR")

    # close the cookie banner
    cookie_button = WebDriverWait(driver, 5).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, '#cdk-overlay-0 .btn'))
    )
    cookie_button.click()

    # make sure the data grid is visible
    WebDriverWait(driver, 5).until(
        EC.visibility_of_element_located((By.ID, 'dailyAggregateGrid'))
    )

    # we scroll incrementally until we reach the full scroll height
    scroll_max_height = driver.execute_script(
        "return document.querySelector('#dailyAggregateGrid .dx-scrollable-container').scrollHeight"
    )
    print(f"Scroll height is: {scroll_max_height}")

    scroll_increment = 200
    scroll_height = 0
    while scroll_height < scroll_max_height:
        scroll_height += scroll_increment
        print(f"Scrolling to {scroll_height}")
        driver.execute_script(
            f"document.querySelector('#dailyAggregateGrid .dx-scrollable-container').scrollTo(0, {scroll_height})"
        )
        # wait some time to load the elements. Alternatively, you can also watch for changes
        time.sleep(1)
        # todo: read the elements here. Make sure to handle duplicates as there's some overlap
        item_dates = driver.find_elements(By.CSS_SELECTOR, "#dailyAggregateGrid .dx-datagrid-first-header")
        print([i.text for i in item_dates])
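The TODO about duplicates could be handled along the following lines. This is only a minimal sketch, not part of the tested example: the helper name `collect_unique` and the sample dates are made up for illustration, and it assumes the text of each first-column cell (for instance a date) identifies its row uniquely. It also skips empty strings, on the assumption that a cell which is being recycled by the grid may briefly report no text.

```python
def collect_unique(batches):
    """Merge overlapping batches of row texts, keeping first-seen order.

    `batches` is a list of lists, one inner list per scroll step.
    Assumes each non-empty text identifies exactly one row.
    """
    seen = set()      # fast membership check
    ordered = []      # preserves the order in which rows first appeared
    for batch in batches:
        for text in batch:
            if text and text not in seen:
                seen.add(text)
                ordered.append(text)
    return ordered


# Simulated scroll steps with overlap between consecutive steps
# (hypothetical values, not real data from the page).
batches = [
    ["01.01.2025", "02.01.2025", "03.01.2025"],
    ["02.01.2025", "03.01.2025", "04.01.2025"],
    ["04.01.2025", "05.01.2025"],
]
rows = collect_unique(batches)
print(rows)
```

In the scrolling loop, one would append `[i.text for i in item_dates]` to `batches` after each scroll step and call `collect_unique(batches)` once the loop has finished. If the cell text alone is not unique on your page, a tuple of several cell values per row would be a safer key.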

Hope this helps!

python selenium-webdriver web-scraping