How do I find the "description" div with class_=css-gz8dae using BeautifulSoup?


I'm new to Python and am learning it for web scraping. I'm using BeautifulSoup to collect the descriptions from job offers. On another job site, the same code with a different div class finds what I want. For justjoin.it I wrote this code:

    import requests
    from bs4 import BeautifulSoup

    link = "https://justjoin.it/offers/jungle-devops-engineer"
    response_IDs = requests.get(link)
    soup = BeautifulSoup(response_IDs.text, 'html.parser')
    Search_part = soup.find(id='root')
    description = Search_part.find_all('div', class_='css-gz8dae')
    for i in description:
        print(i)

Please help me write code that works.

python css web-scraping beautifulsoup css-selectors
1 Answer

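The page builds its content with JavaScript, so the HTML that requests downloads probably does not contain the css-gz8dae div yet. Selenium can load the page in a real browser and hand the rendered HTML to BeautifulSoup. First, make sure Selenium is installed:

    pip install selenium

On Google Colab, put a ! in front of pip install (see below).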
I run all of my Python on Google Colab with Firefox, and this worked for me:

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    link = "https://justjoin.it/offers/jungle-devops-engineer"

    # Set up a headless browser (no GUI).
    # The older "options.headless = True" setter was removed in newer Selenium 4 releases.
    options = Options()
    options.add_argument("--headless")
    browser = webdriver.Firefox(options=options)

    # Use Selenium to get the page source after JavaScript has executed
    browser.get(link)
    page_source = browser.page_source
    browser.quit()

    # Use BeautifulSoup to parse the HTML
    soup = BeautifulSoup(page_source, 'html.parser')
    description = soup.find_all('div', class_='css-gz8dae')
    for i in description:
        print(i.text)

This is the output:

Running a flexible Machine Learning engine at scale is hard. We must ingest and process large volumes of data uninterruptedly and store it in a scalable manner. The data needs to be prepared and served to hundreds of models constantly. All the predictions of the models, as well as other data pipelines, ...
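To see why the requests-only version printed nothing, it can help to run the same lookup on both kinds of HTML. The sketch below is only an illustration: `texts_by_class` is a hypothetical helper, and the two HTML strings are made-up stand-ins for what `requests.get(link).text` and `browser.page_source` might contain, not the real justjoin.it markup.

```python
from bs4 import BeautifulSoup


def texts_by_class(html, class_name):
    """Return the text of every <div> that carries the given CSS class."""
    soup = BeautifulSoup(html, "html.parser")
    return [div.get_text(strip=True) for div in soup.find_all("div", class_=class_name)]


# Assumed shape of the raw download: an empty application shell
# that JavaScript would fill in later.
raw_html = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'

# Assumed shape of the HTML after a browser has rendered the page.
rendered_html = (
    '<html><body><div id="root">'
    '<div class="css-gz8dae">Running a flexible Machine Learning engine at scale is hard.</div>'
    '</div></body></html>'
)

print(texts_by_class(raw_html, "css-gz8dae"))       # []
print(texts_by_class(rendered_html, "css-gz8dae"))  # one matching description
```

If the same check on the real `response_IDs.text` also returns an empty list, the class is simply not in the downloaded HTML, and no change to the `find_all` call will help.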

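If you use Chrome, change these lines

    from selenium.webdriver.firefox.options import Options
    browser = webdriver.Firefox(options=options)

to these:

    from selenium.webdriver.chrome.options import Options
    browser = webdriver.Chrome(options=options)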

To run the whole thing on Google Colab, first install Selenium and Firefox like this:

    !pip install selenium
    !apt-get update
    !apt install -y firefox
    !apt install -y wget
    !apt install -y unzip

You also need GeckoDriver, which has to be on the system PATH:

    !wget https://github.com/mozilla/geckodriver/releases/download/v0.30.0/geckodriver-v0.30.0-linux64.tar.gz
    !tar -xvf geckodriver-v0.30.0-linux64.tar.gz
    !chmod +x geckodriver
    !mv geckodriver /usr/local/bin/
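Before launching the browser, it may be worth confirming that both executables are actually visible on PATH. This is a small sketch using only the standard library; `missing_tools` is a hypothetical helper name, and the binary names `firefox` and `geckodriver` are assumed to match what the install commands put in place.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(["firefox", "geckodriver"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("firefox and geckodriver are both on PATH")
```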

Run the code above after these installs.
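If the script still prints nothing, one possible cause is that `page_source` was read before the page finished rendering. The sketch below shows a generic polling loop for that case. It is not part of the original answer: `wait_for_html` is a hypothetical helper, the substring test is a deliberately crude readiness check, and the demonstration uses fake page snapshots so that it runs without a browser.

```python
import time


def wait_for_html(get_html, is_ready, timeout=10.0, interval=0.5):
    """Call get_html() repeatedly until is_ready(html) is true.

    Raises TimeoutError if the condition is still false after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = get_html()
        if is_ready(html):
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"content not ready after {timeout} seconds")
        time.sleep(interval)


# Demonstration with fake page sources: the content "appears" on the third read.
snapshots = iter([
    '<div id="root"></div>',
    '<div id="root"></div>',
    '<div id="root"><div class="css-gz8dae">Job description</div></div>',
])
html = wait_for_html(
    lambda: next(snapshots),
    lambda page: "css-gz8dae" in page,
    timeout=5.0,
    interval=0.01,
)
print(html)
```

With a real browser, the first argument would be something like `lambda: browser.page_source`, called before `browser.quit()`. Selenium also ships its own `WebDriverWait` together with `expected_conditions`, which is usually the more idiomatic choice when only Selenium is involved.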
