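I'm new to Selenium and am trying to extract information from a page, but I can't extract all of the relevant information I need.
Here is a sample of my code: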
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://www.morphmarket.com/all/c/reptiles/pythons/ball-pythons")
time.sleep(5)
snakes = driver.find_elements(By.CSS_SELECTOR,"a.animalCard--avL0R")
link = snakes[0].get_attribute("href")
driver.get(snakes[0].get_attribute("href"))
time.sleep(5)
genes= driver.find_element(By.CSS_SELECTOR, "h1.animalTitle--cH6qE")
print(genes.text)
snake=driver.find_element(By.CSS_SELECTOR,"h2.animalSubTitle--mhYId")
print(snake.text)
price = driver.find_element(By.CSS_SELECTOR, "h1.salePrice--qNIIs")
print(price.text)
#sex = driver.find_element(By.TAG_NAME, "span")
Birth = driver.find_elements(By.CSS_SELECTOR,"div.labelValueContainer--z1CP3")
print(Birth[1].text)
print(Birth[3].text)
print(Birth[4].text)
print(Birth[5].text)
print(Birth[6].text)
print(Birth[7].text)
print(Birth[8].text)
print(Birth[9].text)
print(Birth[10].text)
print(Birth[11].text)
Company= driver.find_element(By.CSS_SELECTOR, "h4.title--qLioF")
print(Company.text)
Location=driver.find_element(By.CSS_SELECTOR,"p.location--TtVtP")
print(Location.text)
membership= driver.find_element(By.CSS_SELECTOR, "span")
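I can extract some of the information, but how do I extract the sex, company, location, and membership information from the page?
You can use their Ajax API to get the results, for example: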
import requests
api_url = "https://www.morphmarket.com/api/v1/listings/"
params = {
    "category": "bps",
    "page": "1",
    "page_size": "24",
    "state": "for_sale",
    "view": "grid",
}
data = requests.get(api_url, params=params).json()
# print(data)
for r in data["results"]:
    print(f"{r['title'][:50]:<50} {r['price']}")
Prints:
Sacred 850.0
Pied 66% Het Clown 450.0
Clown 66% Het Pied 450.0
Pastel Enchi Freeway 500.0
Yellow Belly Het Dg Het Hypo Het Pied 1200.0
2021 0.1 Pastel Chocolate Enchi Desert Ghost Hypo 4800.0
Pastel Stranger 50% Het Clown 1250.0
Super Banana Enchi Leopard 800.0
Pastel Stranger 50% Het Clown 1250.0
Pinstripe Enchi Het Dg Het Hypo Het Pied 400.0
Ultramel Banana 66% Poss Het GeneticStripe 498.0
Black Head Fire Specter 300.0
Super Pastel Leopard Fire Clown 450.0
Pastel Leopard Stranger 50% Het Clown 1650.0
Pastel Mahogany Super Redstripe 50% Double Het Clo 1250.0
Pastel Spider Clown 66% Het Axanthic 350.0
Pastel Specter Black Head 375.0
Orange Dream Cypress Mojave Pastel Probable Fire A 450.0
Coral Glow Hidden Gene Woma Granite Enchi Odium Fa 500.0
Pastel Super Ghi 100% Het Clown 350.0
Yb Scarecrow 5900.0
Cypress Fire Het Clown 550.0
Ghost Het Pied 100.0
Clown 300.0
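To collect more than the first page, the same request can be repeated with an increasing `page` value. The following is a minimal sketch, not a documented client: it assumes the endpoint keeps returning a `results` list and that an empty list or an error status marks the end. The helper names `fetch_page` and `iter_listings` and the `max_pages` cap are illustrative and not part of the site's API.

```python
API_URL = "https://www.morphmarket.com/api/v1/listings/"


def fetch_page(page, page_size=24):
    """Request one page of listings and return its 'results' list."""
    # Imported here so the paging helper below has no hard dependency on requests.
    import requests

    params = {
        "category": "bps",
        "page": str(page),
        "page_size": str(page_size),
        "state": "for_sale",
        "view": "grid",
    }
    response = requests.get(API_URL, params=params, timeout=30)
    if not response.ok:
        # Assumption: a non-2xx status means we ran past the last page.
        return []
    return response.json().get("results", [])


def iter_listings(fetch=fetch_page, max_pages=5):
    """Yield listings page by page until a page is empty or max_pages is reached."""
    for page in range(1, max_pages + 1):
        results = fetch(page)
        if not results:
            break
        yield from results
```

Looping over `iter_listings(max_pages=2)` and printing `r['title']` and `r['price']` would then cover two pages instead of one. Passing the fetch function as a parameter keeps the paging logic easy to try out with canned data before hitting the site.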
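Andrej Kesely's solution is the fast and ideal one. However, if you specifically want to scrape the page with Selenium, see the Selenium solution below.
Use explicit waits instead of time.sleep().
Check the optimized code below: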
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.get("https://www.morphmarket.com/all/c/reptiles/pythons/ball-pythons")
wait = WebDriverWait(driver,10)
snakes = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,"a.animalCard--avL0R")))
link = snakes[0].get_attribute("href")
driver.get(link)
genes = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "h1.animalTitle--cH6qE")))
print(genes.text)
snake = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "h2.animalSubTitle--mhYId")))
print(snake.text)
price = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "h1.salePrice--qNIIs")))
print(price.text)
Birth = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,"div.labelValueContainer--z1CP3")))
for birth in Birth:
    print(birth.text)
Company_info = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "(//div[@class='infoWrapper--O_L9E'])[2]")))
for element in Company_info:
    print(element.text)
Console output:
Enchi Pinstripe Het Dg Het Hypo Het Pied
Ball Pythons Baby
$1,200.00
Sex:
Traits:
Enchi
Pinstripe
Het Desert Ghost
Het Hypo
Het Piebald
Origin:
Self Produced
Birth:
2022
Weight:
280g
Diet:
Frozen/Thawed Rat
Shipping:
Free
Shipping Details:
Regional Shipping
Animal ID:
23-115-19
First Posted:
12/18/23
Last Renewed:
02/28/24
Last Updated:
02/28/24
ML Exotics
5.0
(126)
Taunton, Massachusetts
Pro Member
Process finished with exit code 0
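The label/value lines printed above are easier to work with as a dictionary. Here is a minimal sketch, assuming each `div.labelValueContainer--z1CP3` element's `.text` is one label line ending in a colon followed by zero or more value lines, which is consistent with the console output but not verified against the page. `parse_label_values` is a hypothetical helper name.

```python
def parse_label_values(blocks):
    """Map each block's first line (the label) to the list of lines that follow it."""
    details = {}
    for block in blocks:
        # Drop blank lines and surrounding whitespace.
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        label = lines[0].rstrip(":")
        details[label] = lines[1:]
    return details


# Sample texts copied from the console output above (grouping is assumed).
sample = [
    "Sex:",
    "Traits:\nEnchi\nPinstripe\nHet Desert Ghost\nHet Hypo\nHet Piebald",
    "Origin:\nSelf Produced",
    "Birth:\n2022",
]
details = parse_label_values(sample)
print(details["Traits"])
print(details["Sex"])
```

With Selenium, the same helper could be called as `parse_label_values(el.text for el in Birth)`. Note that `Sex` comes back empty here, just as in the console output, which may mean the page renders it as something other than text; this sketch does not attempt to recover it.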