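I'm trying to download the CSV data from this website. It looks simple, but Selenium can't find the button. I've downloaded the page source and verified that the button is there, so I'm not sure what the problem is.
Here is the element: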
<button class="btn exportBTN btn-default" id="_eucleflegislationlist_WAR_euclefportlet_exportButtonCSV" name="_eucleflegislationlist_WAR_euclefportlet_exportButtonCSV" onclick="_eucleflegislationlist_WAR_euclefportlet_exportData('csv');" type="button">
<span class="lfr-btn-label">CSV</span>
</button>
Here is my code:
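import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
# Assumption: the original options were not shown, so a default Options object stands in here
chrome_options = Options()
# Set up the Chrome WebDriver
driver = webdriver.Chrome(options=chrome_options)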
# Open the webpage
driver.get('https://www.echa.europa.eu/web/guest/cosmetics-prohibited-substances?p_p_id=eucleflegislationlist_WAR_euclefportlet&p_p_lifecycle=0&p_p_state=normal&p_p_mode=view&_eucleflegislationlist_WAR_euclefportlet_cur=1&_eucleflegislationlist_WAR_euclefportlet_substance_identifier_field_key=&_eucleflegislationlist_WAR_euclefportlet_delta=50&_eucleflegislationlist_WAR_euclefportlet_doSearch=&_eucleflegislationlist_WAR_euclefportlet_deltaParamValue=50&_eucleflegislationlist_WAR_euclefportlet_orderByCol=fld_erc2_maxthres&_eucleflegislationlist_WAR_euclefportlet_orderByType=desc')
# Optional: Wait for the page to load
time.sleep(5)
# Step 1: Click the "Accept Disclaimer" button
accept_button = driver.find_element(By.ID, "_viewsubstances_WAR_echarevsubstanceportlet_acceptDisclaimerButton")
accept_button.click()
# Optional: Wait for any page transition after accepting terms
time.sleep(3)
# Step 2: Find the CSV export button and click it to download
csv_button = driver.find_element(By.ID, "_euclegislationlist_WAR_euclefportlet_exportButtonCSV")
csv_button.click()
# Optional: Wait for the download to complete
time.sleep(10)
# Close the browser
driver.quit()
Error message:
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="_euclegislationlist_WAR_euclefportlet_exportButtonCSV"]"}
(Session info: chrome=129.0.6668.90); For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors#no-such-element-exception
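Comparing the two snippets above, the id passed to `find_element` does not appear to match the id on the button: the element in the page source starts with `_eucleflegislationlist_`, while the script (and the selector echoed in the error) starts with `_euclegislationlist_`. A small check makes the difference visible. The helper `first_mismatch` is a hypothetical name used only for this illustration, and both strings are copied from the question:

```python
# id copied from the <button> element in the page source
html_id = "_eucleflegislationlist_WAR_euclefportlet_exportButtonCSV"
# id passed to driver.find_element in the script
code_id = "_euclegislationlist_WAR_euclefportlet_exportButtonCSV"


def first_mismatch(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One string is a prefix of the other (or they are identical)
    return -1 if len(a) == len(b) else min(len(a), len(b))


print(html_id == code_id)                # False
print(len(html_id) - len(code_id))       # 3
print(first_mismatch(html_id, code_id))  # 6
```

If that is the cause, passing the id exactly as it appears in the page source to `By.ID` should let the lookup succeed. I have not verified this against the live page, and an element that loads late or sits inside an iframe could raise the same exception.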
Do you need to use Selenium? Since the table is already in the page's HTML, it's a bit overkill.
from io import StringIO

import pandas as pd
import requests

url = 'https://www.echa.europa.eu/web/guest/cosmetics-prohibited-substances'
page = 0
dfs = []
while True:
    page += 1
    payload = {
        'p_p_id': 'eucleflegislationlist_WAR_euclefportlet',
        'p_p_lifecycle': '0',
        'p_p_state': 'normal',
        'p_p_mode': 'view',
        '_eucleflegislationlist_WAR_euclefportlet_orderByCol': 'fld_erc2_maxthres',
        '_eucleflegislationlist_WAR_euclefportlet_substance_identifier_field_key': '',
        '_eucleflegislationlist_WAR_euclefportlet_orderByType': 'desc',
        '_eucleflegislationlist_WAR_euclefportlet_doSearch': '',
        '_eucleflegislationlist_WAR_euclefportlet_deltaParamValue': '50',
        '_eucleflegislationlist_WAR_euclefportlet_resetCur': 'false',
        '_eucleflegislationlist_WAR_euclefportlet_delta': '200',
        # Page number to request (the duplicate hard-coded 'cur': '1' entry was removed)
        '_eucleflegislationlist_WAR_euclefportlet_cur': f'{page}',
    }
    response = requests.get(url, params=payload)
    # Wrap the HTML in StringIO: newer pandas versions deprecate passing a literal string
    df = pd.read_html(StringIO(response.text))[0]
    # Stop when a page returns only a single row (treated here as the end of the results)
    if len(df) == 1:
        break
    dfs.append(df)
    print(f'Page: {page}')

# Combine all pages, drop rows with missing values, and write the CSV
df = pd.concat(dfs)
df = df.dropna()
df.to_csv('output.csv', index=False)
Output:
Substance Name EC No. CAS No. Ref No. Product type, body parts Maximum Threshold Restriction(s) Unnamed: 7
0 Safrole 202-345-4 94-59-7 360.0 All cosmetic products except products for dental and oral hygiene 100 ppm Not permitted except for normal content in the natural essences used and provided the concentration does not exceed 100 ppm in the finished product Details
1 Safrole 202-345-4 94-59-7 360.0 Products for dental and oral hygiene 50 ppm Not permitted except for normal content in the natural essences used and provided the concentration does not exceed 50 ppm in the finished product and provided that is not present in toothpastes intended for children Details
2 Trioxysalen 223-459-0 3902-71-4 358.0 Sun protection products and bronzing products 1 mg/kg Not permitted for all products except if occurring in natural essences Details
3 Furocoumarines - - 358.0 Sun protection products and bronzing products 1 mg/kg Not permitted for all products except if occurring in natural essences Details
4 8-Methoxypsoralen 206-066-9 298-81-7 358.0 Sun protection products and bronzing products 1 mg/kg Not permitted for all products except if occurring in natural essences Details
....
[666 rows x 8 columns]
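The script above relies on a stop condition: it keeps requesting pages until one comes back with a single row, which it treats as "no more data". Here is a minimal offline sketch of that pattern. `collect_pages`, `fake_fetch`, `FAKE_PAGES`, and the `max_pages` cap are hypothetical names introduced for illustration, and the fake single-row page is an assumption about what the site returns past the last page, not something I have confirmed:

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, ...]


def collect_pages(fetch_page: Callable[[int], List[Row]],
                  max_pages: int = 100) -> List[Row]:
    """Request pages 1, 2, ... and stop at the first page with at most one row."""
    rows: List[Row] = []
    for page in range(1, max_pages + 1):
        page_rows = fetch_page(page)
        if len(page_rows) <= 1:  # assumed end-of-results marker
            break
        rows.extend(page_rows)
    return rows


# Hypothetical stand-in for the HTTP request; names and CAS numbers
# are taken from the output shown above.
FAKE_PAGES: Dict[int, List[Row]] = {
    1: [("Safrole", "94-59-7"), ("Trioxysalen", "3902-71-4")],
    2: [("8-Methoxypsoralen", "298-81-7"), ("Furocoumarines", "-")],
}


def fake_fetch(page: int) -> List[Row]:
    # Past the last page, return a single placeholder row (an assumption)
    return FAKE_PAGES.get(page, [("No results",)])


all_rows = collect_pages(fake_fetch)
print(len(all_rows))  # 4
```

Unlike the `while True` loop, this sketch caps the number of requests with `max_pages`, so an unexpected response shape cannot make it loop forever.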