I am trying to download some images from a website (say, the first 10). The problem is that I don't know how HTML works.
What I have done so far:
from selenium import webdriver
import time

driver = webdriver.Chrome("C:\web_driver\chromedriver")
url = "https://9gag.com/"
driver.get(url)
time.sleep(5)
driver.find_element_by_xpath("/html/body/div[7]/div[1]/div[2]/div/div[3]/button[2]/span").click()
images = driver.find_elements_by_tag_name('img')
list = []
for image in images:
    print(image.get_attribute('src'))
    list.append(image.get_attribute('src'))
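I want to download the images in the center of the page, but the program only retrieves the images in the left sidebar. This is how I tried to fix it: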
from selenium import webdriver
import time

driver = webdriver.Chrome("C:\web_driver\chromedriver")
url = "https://9gag.com/"
driver.get(url)
time.sleep(5)
# this part is to close the cookies pop up
driver.find_element_by_xpath("/html/body/div[7]/div[1]/div[2]/div/div[3]/button[2]/span").click()
images = driver.find_element_by_class_name("page").get_attribute("img")
list = []
for image in images:
    print(image.get_attribute('src'))
    # list.append(image.get_attribute('src'))
# print("list:", list)
time.sleep(1)
But I get the following error:
Traceback (most recent call last):
  File "C:/Users/asus/PycharmProjects/project1/36.py", line 14, in <module>
    for image in images:
TypeError: 'NoneType' object is not iterable

Process finished with exit code 1
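<div class=page> does not have an img attribute, so get_attribute("img") returns None, and looping over None raises the TypeError. You have to look for <img> tags instead. Also note that find_element_by_ returns only a single element. To get a list of elements you have to use find_elements_by_.
The images of the posts in the center of the page can be matched with this XPath:
//div[contains(@id,'stream-')]//div[@class='post-container']//picture/img
GIFs are not images and are not inside an <img> tag, so this method only gets you the still images. Try this: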
images = driver.find_elements_by_xpath("//div[contains(@id,'stream-')]//div[@class='post-container']//picture/img")
# named image_urls rather than list, so the built-in list type is not shadowed
image_urls = []
for image in images:
    print(image.get_attribute('src'))
    image_urls.append(image.get_attribute('src'))
This puts the src of every image it finds into the list.
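To actually save the first 10 images, something along these lines might work. This is only a sketch under a few assumptions: it uses the Selenium 4 style `find_elements(By.XPATH, ...)`, because the `find_elements_by_*` helpers are no longer available in recent Selenium 4 releases; it assumes the XPath above still matches 9gag's current markup; and it assumes the image server accepts a plain `urllib` request. The helper names (`first_unique`, `filename_for`, `collect_srcs`, `download_all`, `main`) and the `images` folder are made up for illustration, and the cookie pop-up is not handled here.

```python
import os
import urllib.request
from urllib.parse import urlparse

# Same XPath as in the answer above; the site's markup may have changed since.
POST_IMAGE_XPATH = (
    "//div[contains(@id,'stream-')]"
    "//div[@class='post-container']//picture/img"
)


def first_unique(srcs, limit=10):
    """Return up to `limit` non-empty, de-duplicated URLs in page order."""
    unique = []
    for src in srcs:
        if len(unique) >= limit:
            break
        # get_attribute('src') can return None or '' for lazy-loaded images
        if src and src not in unique:
            unique.append(src)
    return unique


def filename_for(url, index):
    """Build a local file name from the URL path, falling back to .jpg."""
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return f"image_{index:02d}{ext}"


def collect_srcs(driver, timeout=10):
    """Wait until at least one post image exists, then read every src."""
    # Imported here so the helpers above also work without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    # An explicit wait instead of time.sleep(5): until() keeps polling
    # while the lambda returns an empty (falsy) list.
    WebDriverWait(driver, timeout).until(
        lambda d: d.find_elements(By.XPATH, POST_IMAGE_XPATH)
    )
    images = driver.find_elements(By.XPATH, POST_IMAGE_XPATH)
    return [image.get_attribute("src") for image in images]


def download_all(urls, folder="images"):
    """Save each URL into `folder` (hypothetical default name)."""
    os.makedirs(folder, exist_ok=True)
    for index, url in enumerate(urls, start=1):
        target = os.path.join(folder, filename_for(url, index))
        urllib.request.urlretrieve(url, target)


def main():
    from selenium import webdriver

    # Assumes a recent Selenium 4 that can locate chromedriver by itself.
    driver = webdriver.Chrome()
    try:
        driver.get("https://9gag.com/")
        download_all(first_unique(collect_srcs(driver)))
    finally:
        driver.quit()
```

Calling `main()` runs the whole thing. The filtering is kept separate from the browser code because `get_attribute('src')` may return `None` for images that have not loaded yet, and passing those straight to a downloader would fail.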