Web scraping with Beautiful Soup: multiple repeated tags

Question

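This is my first time web scraping, and I am following this tutorial. I am using this website to scrape information. I am trying to get the text "89426 Green Mountain Road, Astoria, OR 97103. Phone: 503-325-9720". I noticed that there are multiple li tags inside my ul tag, as well as multiple div class_=alert tags, so I am not sure how to grab one specific element. Here is what I tried, but I keep getting different text from another ul/li group.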

from bs4 import BeautifulSoup
import requests

source = requests.get('https://www.pickyourownchristmastree.org/ORxmasnw.php').text

soup = BeautifulSoup(source, 'lxml')

noble_ridge = soup.find('div', class_='alert')
information = noble_ridge.ul.li.text
print(information)
# print(soup.prettify())


C:\Users\name\anaconda3\envs\Scraping\python.exe C:/Users/name/PycharmProjects/Scraping/Christmas_tree_farms.py
If the name of the farm is blue with an underline; that's a link to their website. Click on it for the most current hours and information.

Process finished with exit code 0
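
A note on why the attempt above prints the wrong text: find() returns only the first matching element in document order, so the first div with class alert wins, whichever farm it belongs to. The sketch below illustrates this on a small hypothetical HTML snippet (SAMPLE_HTML is invented for illustration and is not the real page markup), comparing find() with find_all().

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the question: two div.alert blocks.
SAMPLE_HTML = """
<div class="alert"><ul><li>If the name of the farm is blue with an underline; that's a link to their website.</li></ul></div>
<div class="alert"><ul><li>89426 Green Mountain Road, Astoria, OR 97103. Phone: 503-325-9720.</li></ul></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() stops at the first match in document order.
first = soup.find("div", class_="alert")
first_text = first.ul.li.get_text(strip=True)

# find_all() returns every match, so each block can be inspected in turn.
all_texts = [
    div.ul.li.get_text(strip=True)
    for div in soup.find_all("div", class_="alert")
]

print(first_text)
for text in all_texts:
    print(text)
```

With find_all(), the wanted block can then be picked by its content rather than by its position.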

Answer 1

import requests
from bs4 import BeautifulSoup


def main(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
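    # The first span.farm is the starting point; the address follows it in the document.
    target = soup.select_one("span.farm")
    # next_elements walks every node after the span in document order.
    # Index 5 depends on this page's current markup and will break if the layout changes.
    # rsplit(" ", 2)[0] keeps everything before the last two spaces, trimming trailing text.
    goal = list(target.next_elements)[5].rsplit(" ", 2)[0]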
    print(goal)


main("https://www.pickyourownchristmastree.org/ORxmasnw.php")

Output:

89426 Green Mountain Road, Astoria, OR 97103. Phone: 503-325-9720.
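
Because the fixed index into next_elements is tied to the page layout, a text-based lookup may be less brittle. The following is a minimal sketch, not the answer's method: it searches for a text node containing a known fragment of the address and trims it with a regular expression. The helper name find_contact_line, the CONTACT_RE pattern, and SAMPLE_HTML are all assumptions made for illustration; the real page markup may differ.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may be structured differently.
SAMPLE_HTML = """
<ul>
  <li><span class="farm">Example Farm</span> - noble fir, saws provided
    <br>89426 Green Mountain Road, Astoria, OR 97103. Phone: 503-325-9720. Open: weekends.
  </li>
  <li><span class="farm">Other Farm</span> - <br>1 Sample Lane, Sometown, OR 97000. Phone: 503-555-0100.</li>
</ul>
"""

# Assumed shape of a contact line: any text, then "Phone:" and a 3-3-4 digit number.
CONTACT_RE = re.compile(r"(.+?Phone:\s*\d{3}-\d{3}-\d{4})")


def find_contact_line(html, fragment):
    """Return the address and phone text containing `fragment`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # string=<compiled regex> matches text nodes whose content the regex finds.
    for node in soup.find_all(string=re.compile(re.escape(fragment))):
        match = CONTACT_RE.search(node.strip())
        if match:
            return match.group(1) + "."
    return None


print(find_contact_line(SAMPLE_HTML, "Green Mountain Road"))
```

To try this against the live site, pass requests.get(url).text as the html argument; whether the pattern matches there depends on how the page actually formats its contact lines.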
