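I have a piece of Python code that submits an English text to a website (https://edu.visl.dk/visl/en/parsing/automatic/trees.php) and extracts the resulting syntax tree. So far I have been doing this with Selenium: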
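from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select

options = webdriver.ChromeOptions()
options.add_argument('--headless')
# Raw string so the backslashes in the Windows path are not treated as escapes
service = Service(executable_path=r"c:\Program Files (x86)\chromedriver.exe")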
driver = webdriver.Chrome(service=service, options=options)
driver.get("https://edu.visl.dk/visl/en/parsing/automatic/trees.php")
form = driver.find_element(By.NAME, "theform")
dropdown = form.find_element(By.NAME, "visual")
drop = Select(dropdown)
drop.select_by_visible_text("Vertical")
search = form.find_element(By.NAME, "text")
search.send_keys("John killed the cat with a hammer.")
submit = form.find_element(By.TAG_NAME, "input")
submit.click()
results = driver.find_element(By.TAG_NAME, "pre")
soup = str(bs(results.get_attribute("innerHTML"), "html.parser"))
However, I need to do this over and over (in a loop), and it is too slow for my purposes. Is there a faster way to do this?
Use requests instead:
import requests
from bs4 import BeautifulSoup
payload = {
'text': 'John killed the cat with a hammer.',
'export': 'Export and Download',
'parser': 'tree',
'visual': 'vertical',
'symbol': 'default',
}
url = 'https://edu.visl.dk/visl/en/parsing/automatic/trees.php'
response = requests.post(url, data=payload)
soup = BeautifulSoup(response.text, 'html.parser')
result = soup.body.pre.get_text(strip=True)
print(result)
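Since the question is about doing this in a loop, here is a minimal sketch of how the same request could be repeated for many sentences while reusing one HTTP connection through `requests.Session`. The helper names (`build_payload`, `extract_tree`, `parse_sentences`) and the 30-second timeout are my own assumptions for illustration; the form fields are the ones from the payload above, and I have not measured how much time the session reuse saves on this particular site.

```python
import requests
from bs4 import BeautifulSoup

URL = 'https://edu.visl.dk/visl/en/parsing/automatic/trees.php'


def build_payload(text):
    """Same form fields as the single-request example; only the text varies."""
    return {
        'text': text,
        'export': 'Export and Download',
        'parser': 'tree',
        'visual': 'vertical',
        'symbol': 'default',
    }


def extract_tree(html):
    """Return the text of the first <pre> element, or None if there is none."""
    pre = BeautifulSoup(html, 'html.parser').find('pre')
    return pre.get_text(strip=True) if pre is not None else None


def parse_sentences(sentences, session=None):
    """Post each sentence in turn, sharing one Session (hypothetical helper)."""
    own_session = session is None
    if own_session:
        session = requests.Session()
    try:
        trees = []
        for sentence in sentences:
            # timeout=30 is an arbitrary choice, not something the site requires
            response = session.post(URL, data=build_payload(sentence), timeout=30)
            response.raise_for_status()
            trees.append(extract_tree(response.text))
        return trees
    finally:
        # Only close the session if this function created it
        if own_session:
            session.close()


if __name__ == '__main__':
    for tree in parse_sentences([
        'John killed the cat with a hammer.',
        'The cat sat on the mat.',
    ]):
        print(tree)
```

A `Session` keeps the underlying connection open between requests, so each iteration avoids a new TCP/TLS handshake. If you send a large number of sentences, consider adding a short pause between requests so you do not overload the server.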