Using Beautiful Soup to scrape titles/links


I'm trying to write code that scrapes the link text in the gray box on the left-hand side of this web page. In this case, the code should return

Valkyrie, The
Acid Baby

Here is the code I tried:

import requests
from bs4 import BeautifulSoup

url = 'https://www.mountainproject.com/area/109928429/aasgard-sentinel'
page = requests.get(url)
soup = BeautifulSoup(page.text, "html.parser")

for link in soup.findAll('a', class_= 'new-indicator'):
    print(link)

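It doesn't work (otherwise I wouldn't be here!). I'm new to BeautifulSoup and to coding in general. No matter how much I inspect the page source, I can't figure out what to pass to findAll to make it return what I want!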

web-scraping beautifulsoup
1 Answer

All the values are stored as href links inside the table tag with id="left-nav-route-table".


Sample code:

import requests
from bs4 import BeautifulSoup

url = "https://www.mountainproject.com/area/109928429/aasgard-sentinel"

resp = requests.get(url).text
soup = BeautifulSoup(resp, 'lxml')  # the 'lxml' parser requires the lxml package
# Locate the table by its id.
data = soup.find('table', id="left-nav-route-table")
# The table has two <a href=...> tags whose text is the desired output.
# They appear in the reverse of the requested order, so print [1] first and [0] second.
links = data.find_all('a', href=True)
print(f"{links[1].text.strip()}\n{links[0].text.strip()}")

Output:

Valkyrie, The
Acid Baby
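
If the box can hold more than two routes, hard-coding the indexes [0] and [1] will not scale. Here is a minimal sketch of a more general version that loops over every link in the table. The SAMPLE_HTML snippet and the route_names helper are made up for illustration; the real page's markup may differ, so treat this as an assumption rather than a copy of the site.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's markup; the real page may differ.
SAMPLE_HTML = """
<table id="left-nav-route-table">
  <tr><td><a href="/route/1/acid-baby"> Acid Baby </a></td></tr>
  <tr><td><a href="/route/2/valkyrie-the"> Valkyrie, The </a></td></tr>
</table>
<a href="/elsewhere">Not a route</a>
"""


def route_names(html):
    """Return the stripped text of every link inside the route table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="left-nav-route-table")
    if table is None:  # table not found, e.g. the page layout changed
        return []
    # Only anchors inside the table are considered, in document order.
    return [a.get_text(strip=True) for a in table.find_all("a", href=True)]


if __name__ == "__main__":
    for name in route_names(SAMPLE_HTML):
        print(name)
```

To run it against the live page, pass requests.get(url).text to route_names instead of SAMPLE_HTML. The names come back in document order, so reverse or sort the list if you need a different order.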