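I am trying to scrape an HVAC website. I want to enter a zip code, click the submit button, and then scrape the resulting information for the nearest dealers. Here is the website, if needed.
My problem is that the zip code is entered correctly, but I cannot get it to submit. After submitting the zip code, I want to check the text in a div that shows the number of dealers and, based on that text, decide what to scrape into a CSV.
Simplified HTML of the button: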
<div class="dl-zipcode-search__block">
    <div class="zipcode-miles__wrap">
        <div class="dl-searchbtn__wrap">
            <button type="button" onclick="Search();">SEARCH</button>
        </div>
    </div>
</div>
Simplified HTML of the dealer count div, with a sample zip code:
<div class="dl-dealer-list__block" id="dealer-list-pagination">
    <div class="dl-dealer-count__block">
        <span>6 Dealers near <span id="searchedZip">35005</span></span>
    </div>
</div>
What I have tried:
#modules here
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import codecs
import re
from time import sleep
import csv
#importing zipcode array
from array import array
#zipcode list is 2nd file with zipcode array
from zipcode_list import zip_arr

#function that checks the dealer count div
def is_multiple_dealers(driver):
    dealer = driver.find_elements(By.CSS_SELECTOR, 'div.dl-dealer-count__block:nth-child(1) > span:nth-child(1)')
    #checking if I am finding the dealer
    print(dealer)
    #boolean that returns true or false depending on if there are 0 results, and the text
    for d in dealer:
        if '0 dealers near' in d.text:
            return False, d.text
            break
        else:
            return True, d.text
            break
        break

#main excerpt
def main():
    #webdriver setup code here
    wait = WebDriverWait(driver, 10)
    #webpage validation code here
    #zipcode array
    a = zip_arr
    for x in range(len(a)):
        #zipcode form submitting code here
        #waiting until submit button is there then submitting
        submit_zipcode = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'div.dl-searchbtn__wrap > button')))
        submit_zipcode.click()
        #scraping with beautiful soup
        soup = BeautifulSoup(page_source,features="html.parser")
        dealer_boolean, dealer_quantity_text = is_multiple_dealers(driver)
        if dealer_boolean == True:
            print(dealer_quantity_text)
            #find data and add to array
        else:
            #add an array entry as "NAN" for no dealers in the zipcode
            pass
    driver.quit()

if __name__ == "__main__":
    main()
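Since I have already been able to scrape another website using a zip code entry and a submit button, I expected this to work.
The last three lines of the traceback: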
File "C:\Path_to_file_here\Scrape_project.py", line 80, in main
dealer_boolean, dealer_quantity_text = is_multiple_dealers(driver)
TypeError: cannot unpack non-iterable NoneType object
The "print(dealer)" in the custom function shows
[]
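So I am guessing that my CSS selector is wrong, or is it how I am calling it? I also tried passing "wait" into the function as a parameter so I could use "wait.until" to make the button visible, but that did not help either.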
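A note on the traceback: the empty list printed above means that every `return` in `is_multiple_dealers` sits inside a loop that never runs, so the function falls through and returns `None`, which cannot be unpacked into two names. The sketch below reproduces this with plain lists of strings instead of Selenium elements. `check` and `check_safe` are hypothetical names used only for illustration, not part of the original script.

```python
def check(texts):
    # Mirrors the structure of is_multiple_dealers: every return is inside the loop.
    for text in texts:
        return True, text
    # No return statement here, so an empty list yields an implicit None.


def check_safe(texts):
    # Same loop, but with an explicit fallback tuple for the empty case.
    for text in texts:
        return True, text
    return False, None


if __name__ == "__main__":
    print(check([]))        # the value that fails to unpack in main()
    print(check_safe([]))   # a tuple, so unpacking into two names works
```

With the fallback in place the caller can still unpack the result, although an empty list most likely also means the element was not on the page yet, or the selector did not match, when it was queried.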
Based on the information you provided, I think you are having trouble locating the search button element. You can use the XPath //button[contains(., 'SEARCH')] to do this:
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
timeout = 5 #time to wait for element to appear
browser = webdriver.Firefox()
browser.get("https://www.goodmanmfg.com/support/find-a-dealer")
input_zip_code_elem = WebDriverWait(browser, timeout).until(EC.presence_of_element_located((By.XPATH, "//input[@name='zipcode']")))
search_button_elem = WebDriverWait(browser, timeout).until(EC.presence_of_element_located((By.XPATH, "//button[contains(., 'SEARCH')]")))
input_zip_code_elem.send_keys("123456")
search_button_elem.click()
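Once the click goes through, the dealer count text still has to be interpreted. Note that the check `'0 dealers near' in d.text` from the question is case-sensitive, so it would not match the capitalised "0 Dealers near" shown in the HTML, and a case-insensitive substring check would also match "10 Dealers near". A minimal sketch of a more robust check, assuming the text always has the form "<number> Dealers near <zip>" as in the sample HTML; `parse_dealer_count` and `DEALER_COUNT_RE` are hypothetical names introduced here:

```python
import re

# Assumed format, taken from the sample HTML: "6 Dealers near 35005".
# The optional "s" and IGNORECASE are guesses to tolerate "1 Dealer near ...".
DEALER_COUNT_RE = re.compile(r"(\d+)\s+dealers?\s+near", re.IGNORECASE)


def parse_dealer_count(text):
    """Return the dealer count found in `text`, or None if the text does not match."""
    match = DEALER_COUNT_RE.search(text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for sample in ("6 Dealers near 35005", "0 Dealers near 99999", ""):
        print(repr(sample), parse_dealer_count(sample))
```

You would pass in the `.text` of the count element after waiting for it, for example with `WebDriverWait(browser, timeout).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.dl-dealer-count__block span")))`. That selector is derived from the simplified HTML in the question and may need adjusting on the live page.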