driver.get("https://tinhte.vn/thread/tong-hop-16-tin-don-moi-nhat-ve-iphone-16.3739136/")
userid = driver.find_elements(By.XPATH, "//*[contains(@href, '/profile/')]").get_attribute("href")
print(userid)
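This is my code. My goal is to scrape all usernames and their IDs from the site. Each user profile link has the form https://tinhte.vn/profile/username.123456/ and is stored in the href attribute.
From searching Google, I understand that I can't call get_attribute on multiple elements at once. What is the alternative?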
Check the code below and its inline explanations:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# `driver` is the WebDriver instance already created in the question's code
driver.get("https://tinhte.vn/thread/tong-hop-16-tin-don-moi-nhat-ve-iphone-16.3739136/")
driver.maximize_window()
wait = WebDriverWait(driver, 15)
# Click on the Consent button
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@aria-label='Consent']"))).click()
# Store all profile link elements in a variable called `userid` (a list of WebElements)
userid = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//*[contains(@href, '/profile/')]")))
# Declare a list called `users`
users = []
# get_attribute works on a single WebElement, so iterate over the list
# and collect the href of each element in `users`
for user in userid:
    users.append(user.get_attribute("href"))
# Print the list
print(users)
Console output:
['https://tinhte.vn/profile/vnninja.43700/', 'https://tinhte.vn/profile/vnninja.43700/', 'https://tinhte.vn/profile/xecatang.2612547/', 'https://tinhte.vn/profile/xecatang.2612547/', 'https://tinhte.vn/profile/cuteo.2549422/', 'https://tinhte.vn/profile/cuteo.2549422/', 'https://tinhte.vn/profile/xecatang.2612547/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/xecatang.2612547/', 'https://tinhte.vn/profile/bao-sai-gon.2928116/', 'https://tinhte.vn/profile/bao-sai-gon.2928116/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/bao-sai-gon.2928116/', 'https://tinhte.vn/profile/jinnie-ktl.2392977/', 'https://tinhte.vn/profile/jinnie-ktl.2392977/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/lekhanh1504.1717481/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/penguin-small.1673674/', 'https://tinhte.vn/profile/naturelovely9.206218/', 'https://tinhte.vn/profile/naturelovely9.206218/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/soj.2649515/', 'https://tinhte.vn/profile/naturelovely9.206218/', 'https://tinhte.vn/profile/truong0977.2369659/', 'https://tinhte.vn/profile/truong0977.2369659/', 'https://tinhte.vn/profile/airwalker.2543950/', 'https://tinhte.vn/profile/airwalker.2543950/', 'https://tinhte.vn/profile/truong0977.2369659/', 
'https://tinhte.vn/profile/namphuong000.1462961/', 'https://tinhte.vn/profile/namphuong000.1462961/', 'https://tinhte.vn/profile/truong0977.2369659/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/%E2%80%9Cbenh-vien-tra-ve%E2%80%9D-da-noi-rang.2991590/', 'https://tinhte.vn/profile/%E2%80%9Cbenh-vien-tra-ve%E2%80%9D-da-noi-rang.2991590/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/lethangk47.1319353/', 'https://tinhte.vn/profile/benh-vien-tra-ve-da-noi-rang.2991590/', 'https://tinhte.vn/profile/mrbinhta.2970728/', 'https://tinhte.vn/profile/mrbinhta.2970728/', 'https://tinhte.vn/profile/kennyn73.185469/', 'https://tinhte.vn/profile/kennyn73.185469/', 'https://tinhte.vn/profile/thienduongld.496304/', 'https://tinhte.vn/profile/thienduongld.496304/', 'https://tinhte.vn/profile/kennyn73.185469/', 'https://tinhte.vn/profile/evilartist.2379222/', 'https://tinhte.vn/profile/evilartist.2379222/', 'https://tinhte.vn/profile/thienduongld.496304/', 'https://tinhte.vn/profile/kaitokid1908.2575569/', 'https://tinhte.vn/profile/kaitokid1908.2575569/', 'https://tinhte.vn/profile/nguyennguyen0127.2933297/', 'https://tinhte.vn/profile/nguyennguyen0127.2933297/', 'https://tinhte.vn/profile/grozar.274181/', 'https://tinhte.vn/profile/grozar.274181/', 'https://tinhte.vn/profile/huong-giang-trang.1777251/', 'https://tinhte.vn/profile/huong-giang-trang.1777251/', 'https://tinhte.vn/profile/hoanglong213.2692496/', 'https://tinhte.vn/profile/hoanglong213.2692496/', 'https://tinhte.vn/profile/tuananhcao.2460888/', 'https://tinhte.vn/profile/tuananhcao.2460888/', 'https://tinhte.vn/profile/dlcr.537546/', 'https://tinhte.vn/profile/dlcr.537546/', 'https://tinhte.vn/profile/iphone2g-lock.2442834/', 'https://tinhte.vn/profile/iphone2g-lock.2442834/', 'https://tinhte.vn/profile/tientran517.2706203/', 
'https://tinhte.vn/profile/tientran517.2706203/', 'https://tinhte.vn/profile/teslaspacex.1738728/', 'https://tinhte.vn/profile/teslaspacex.1738728/']
Process finished with exit code 0
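Since the stated goal is the usernames and IDs rather than the raw links, the collected hrefs still need to be split and de-duplicated. The following is only a minimal sketch, assuming every profile URL ends in `<slug>.<numeric id>/` as described in the question. The names `PROFILE_RE` and `parse_profiles` are hypothetical helpers introduced here, not part of Selenium. It keys the result by ID because the output above shows the same ID (2991590) appearing with two different slugs.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern: matches the trailing "<slug>.<numeric id>/" of a profile URL.
PROFILE_RE = re.compile(r"/profile/(?P<name>[^/]+)\.(?P<id>\d+)/?$")


def parse_profiles(hrefs):
    """Return {user_id: username}, keeping the first slug seen for each ID."""
    profiles = {}
    for href in hrefs:
        match = PROFILE_RE.search(href)
        if match is None:
            continue  # skip links that are not in the expected profile format
        user_id = int(match.group("id"))
        # Decode percent-escapes such as %E2%80%9C in the slug.
        profiles.setdefault(user_id, unquote(match.group("name")))
    return profiles


sample = [
    "https://tinhte.vn/profile/vnninja.43700/",
    "https://tinhte.vn/profile/vnninja.43700/",
    "https://tinhte.vn/profile/xecatang.2612547/",
]
print(parse_profiles(sample))
# {43700: 'vnninja', 2612547: 'xecatang'}
```

In the script above, calling `parse_profiles(users)` after the loop would give one entry per user. Note that the slug in the URL may differ from the display name shown on the page.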