Getting elements by the "data-testid" attribute with Selenium

Question

I am trying to get the data from every element in the HTML that has the following data-testid:

<div data-testid="sl.explore.card-description"></div>

The code I wrote is below, but it does not work:

desc_list = []

desc = driver.find_elements(By.XPATH, "//div[@data-testid='sl.explore.card-description']") 

for i, cp in enumerate(desc):
    splitted = desc[i].text.split('\n');           
    data_desc = str(splitted[0:1])
    desc_list.append(data_desc);    
    df_desc = pd.DataFrame(desc_list)
python selenium-webdriver
1 Answer

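The main problem is most likely in how the elements are accessed and how their text is extracted. In the original loop, str(splitted[0:1]) stores the printed form of a one-item list (brackets and quotes included) rather than the first line itself, and the DataFrame is rebuilt on every iteration. Also check that your XPath selects the right elements and that those elements actually have text to extract. A revised version: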

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By

# Assumes a Selenium WebDriver instance has already been created as 'driver'

desc_list = []

# find_elements (plural) returns a list of every matching element
desc = driver.find_elements(By.XPATH, "//div[@data-testid='sl.explore.card-description']")

for cp in desc:
    # Check if an element has text content before splitting it
    if cp.text:
        # Split text content with newline characters
        splitted = cp.text.split('\n')
        # Extract the desired data (for example, the first line)
        data_desc = splitted[0]
        desc_list.append(data_desc)

# Create the DataFrame after the loop to avoid creating unnecessary DataFrames inside the loop
df_desc = pd.DataFrame(desc_list, columns=['Description'])
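
The difference between the two text-handling expressions can be seen without a browser. The following is a minimal sketch; sample_text is a made-up stand-in for what an element's .text might return and is not taken from the real page.

```python
# Hypothetical sample of what WebElement.text might return for one card.
sample_text = "First line of the description\nSecond line"

splitted = sample_text.split('\n')

# The question's approach: slicing gives a one-item list, and str() of a
# list is its printed representation, brackets and quotes included.
as_slice = str(splitted[0:1])

# The revised approach: indexing gives the first line itself.
as_index = splitted[0]

print(as_slice)  # ['First line of the description']
print(as_index)  # First line of the description
```

With the slice version, every cell of the resulting DataFrame would carry the extra brackets and quotes, which is probably not what was intended.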

Make sure your XPath actually selects the elements you want, and that those elements have text content that can be extracted.
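
One way to check both points is to count how many elements matched and how many of them carry any text. The sketch below is only an illustration: summarize_matches is a hypothetical helper, and the SimpleNamespace objects stand in for real WebElements so the snippet runs without a browser.

```python
from types import SimpleNamespace


def summarize_matches(elements):
    """Return (number of matches, number of matches with non-blank text)."""
    total = len(elements)
    with_text = sum(1 for el in elements if el.text.strip())
    return total, with_text


# Stand-in objects with a .text attribute, used here instead of WebElements.
fake_elements = [
    SimpleNamespace(text="Card one\nmore detail"),
    SimpleNamespace(text=""),
    SimpleNamespace(text="   "),
]

total, with_text = summarize_matches(fake_elements)
print(f"{total} matched, {with_text} with text")  # 3 matched, 1 with text
```

In a real session the list returned by driver.find_elements(...) would be passed in instead. A total of zero suggests the XPath does not match anything (or the page has not finished rendering), while a non-zero total with no text suggests the matched elements are empty or not visible.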
