Questions related to beautifulsoup

    import requests
    from bs4 import BeautifulSoup

    # string url
    str_url = 'https://www.utahrealestate.com/search/map.search'
    # get response
    response = requests.get(str_url)
    # get html
    soup = BeautifulSoup(response.text, 'html.parser')
    # get the number of listings and assign it to int_n_pages (I can't get this to work; it returns NoneType)
    int_n_pages = soup.find('li', {'class': 'view-results'})
    # split and get n pages (this does not work because the previous line does not work)
    int_n_pages = int(int_n_pages.split(' ')[2])

python web-scraping beautifulsoup

1 answer 0 votes
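A rough sketch of one way to approach the NoneType problem above. `soup.find()` returns `None` when no matching element exists in the HTML that was downloaded (one possible cause, not confirmed here, is that the page builds that element with JavaScript). When it does match, it returns a `Tag` rather than a string, so the text has to be read with `get_text()` before it can be parsed. The sample markup, the wording inside the `<li>`, and the helper name `parse_page_count` are assumptions made for illustration, and a regular expression stands in for the original `split(' ')[2]`.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup: the real page's structure and wording are not known.
SAMPLE_HTML = '<ul><li class="view-results">25 pages of results</li></ul>'


def parse_page_count(html):
    """Return the first integer inside <li class="view-results">, or None."""
    soup = BeautifulSoup(html, "html.parser")
    item = soup.find("li", class_="view-results")
    if item is None:
        # The element is absent from the HTML that was parsed.
        return None
    # find() returns a Tag, so read its text before parsing it.
    match = re.search(r"\d+", item.get_text(strip=True))
    return int(match.group()) if match else None


print(parse_page_count(SAMPLE_HTML))
print(parse_page_count("<p>no results element here</p>"))
```

Returning `None` instead of raising keeps the "element missing" case separate from the "element present but unparseable" case only loosely; a caller that needs to tell them apart could raise distinct exceptions instead.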

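Searching for a string in HTML with BeautifulSoup

I'm using BeautifulSoup to look for a user-entered string on a specific page. For example, I want to see whether the string "Python" appears on the page http://python.org. When I use: find_string = ...

python beautifulsoup

5 answers 0 votes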
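A minimal sketch of substring searching, under the assumption that the goal is to find every text node containing the user's string. Passing a plain string as `string=` to `find_all()` only matches text nodes that equal it exactly, whereas a compiled regular expression matches anywhere inside a node. The sample HTML and the helper name `find_text_nodes` are made up for illustration.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical page content used instead of a live request.
SAMPLE_HTML = (
    "<html><body><h1>Welcome to Python.org</h1>"
    "<p>Python is a programming language.</p>"
    "<p>Nothing to see here.</p></body></html>"
)


def find_text_nodes(html, needle):
    """Return every text node that contains `needle` as a substring."""
    soup = BeautifulSoup(html, "html.parser")
    # re.escape() keeps user input from being read as regex syntax.
    pattern = re.compile(re.escape(needle))
    return [str(node) for node in soup.find_all(string=pattern)]


for text in find_text_nodes(SAMPLE_HTML, "Python"):
    print(text)
```

If only a yes/no answer is needed, checking `needle in soup.get_text()` is a simpler alternative.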

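Scraping a dynamic data table with no easy reference

I'm trying to get data from a simple table on the following website (https://bvmf.bmfbovespa.com.br/clube-de-investimento/clube-de-investimento.aspx?Idioma=pt-br). I was able to get the data...

python pandas web-scraping beautifulsoup python-requests

2 answers 0 votes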
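The sketch below covers only the parsing step, once the HTML containing the table is already in hand; how to obtain that HTML from this particular site is not addressed. The table contents, column names, and the helper name `table_to_records` are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical table; the real site's columns are not shown in the excerpt.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Manager</th></tr>
  <tr><td>Club A</td><td>Broker X</td></tr>
  <tr><td>Club B</td><td>Broker Y</td></tr>
</table>
"""


def table_to_records(html):
    """Turn the first <table> into a list of dicts keyed by header text."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    rows = table.find_all("tr")
    if not rows:
        return []
    headers = [cell.get_text(strip=True) for cell in rows[0].find_all(["th", "td"])]
    records = []
    for row in rows[1:]:
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        # Skip rows whose cell count does not line up with the header.
        if len(cells) == len(headers):
            records.append(dict(zip(headers, cells)))
    return records


for record in table_to_records(SAMPLE_HTML):
    print(record)
```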

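How do I extract the text after the first h1 tag?

I'm trying to write code that fetches and cleans text from 100 websites every day. I ran into a problem: one website has multiple h1 tags, and when you scroll to the next h1 tag, the URL ...

python html beautifulsoup content-management-system

2 answers 0 votes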
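One possible approach, sketched under the assumption that the headings and the content between them are siblings in the same parent element: start at the first `<h1>` and walk `next_siblings` until another `<h1>` appears. The sample HTML and the helper name `text_after_first_h1` are invented for illustration; pages that nest each article in its own wrapper would need a different traversal.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical page with two articles separated by <h1> headings.
SAMPLE_HTML = (
    "<body><h1>First story</h1><p>Alpha paragraph.</p><p>Beta paragraph.</p>"
    "<h1>Second story</h1><p>Gamma paragraph.</p></body>"
)


def text_after_first_h1(html):
    """Collect text between the first <h1> and the next sibling <h1>."""
    soup = BeautifulSoup(html, "html.parser")
    first = soup.find("h1")
    if first is None:
        return ""
    parts = []
    for sibling in first.next_siblings:
        if isinstance(sibling, Tag):
            if sibling.name == "h1":
                break  # The next article starts here.
            text = sibling.get_text(" ", strip=True)
        else:
            text = str(sibling).strip()  # Bare text node between tags.
        if text:
            parts.append(text)
    return " ".join(parts)


print(text_after_first_h1(SAMPLE_HTML))
```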

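Getting a video's length from YouTube with requests and BeautifulSoup

When getting the video length from a YouTube URL, inspecting the page in a web browser shows that there is a line: Then I use requests and BeautifulSoup to get it: import requests from bs4 import BeautifulSoup url=...

python web-scraping beautifulsoup request youtube

3 answers 0 votes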

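Parser in Python returns an empty list (I guess it's an HTML class selection problem)

My idea is this: I want to collect the names of apartments and their prices as a list for every apartment on the website. I wrote a simple parser in Python, but it looks like I can't get any values because it returns a

python selenium-webdriver parsing beautifulsoup html-parsing

1 answer 0 votes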
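An empty list usually means the selector does not match the HTML that was actually parsed. As a hedged illustration of class-based selection, the sketch below uses the CSS selector `div.apartment`, which matches any `<div>` whose class list includes `apartment`, even when other classes are present. All class names, the sample listings, and the helper name `collect_apartments` are assumptions, since the excerpt does not show the real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical listing markup; the real site's class names are unknown.
SAMPLE_HTML = """
<div class="card apartment">
  <span class="name">Sunny Loft</span><span class="price">$1,200</span>
</div>
<div class="card apartment">
  <span class="name">Garden Flat</span><span class="price">$950</span>
</div>
<div class="card banner">Advertisement</div>
"""


def collect_apartments(html):
    """Return (name, price) pairs for every element matching div.apartment."""
    soup = BeautifulSoup(html, "html.parser")
    listings = []
    for card in soup.select("div.apartment"):
        name = card.select_one(".name")
        price = card.select_one(".price")
        if name is None or price is None:
            continue  # Skip cards that lack either field.
        listings.append((name.get_text(strip=True), price.get_text(strip=True)))
    return listings


print(collect_apartments(SAMPLE_HTML))
```

Printing the parsed HTML and searching it for the expected class name is a quick way to check whether the elements are present before adjusting the selector.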

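How do I update the src attribute of an <img> tag in a Markdown file using Python?

I have a Markdown file, README.md, that contains HTML elements such as a tag with the attributes id and src. I want to programmatically update the attributes of this HTML element using Pyt...

python beautifulsoup markdown

1 answer 0 votes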
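A sketch of one way to do this without re-serializing the whole Markdown document: locate each `<img ...>` tag with a regular expression, parse only that fragment with BeautifulSoup, and substitute the modified tag back. The sample Markdown, the `id` values, and the helper name `update_img_src` are hypothetical. The simple pattern assumes no `>` character appears inside an attribute value, and the rewritten tag may be serialized slightly differently from the original (for example, with a trailing slash).

```python
import re

from bs4 import BeautifulSoup

# Matches a single <img ...> tag; assumes no ">" inside attribute values.
IMG_TAG = re.compile(r"<img\b[^>]*>")

# Hypothetical README content.
SAMPLE_MARKDOWN = (
    "# Project\n\n"
    '<img id="logo" src="old-logo.png">\n\n'
    "Some *Markdown* text & more.\n\n"
    '<img id="badge" src="badge.png">\n'
)


def update_img_src(markdown, img_id, new_src):
    """Set src on the <img> whose id equals img_id; leave the rest untouched."""

    def replace(match):
        tag = BeautifulSoup(match.group(0), "html.parser").img
        if tag is None or tag.get("id") != img_id:
            return match.group(0)  # Not the target tag: keep it verbatim.
        tag["src"] = new_src
        return str(tag)

    return IMG_TAG.sub(replace, markdown)


print(update_img_src(SAMPLE_MARKDOWN, "logo", "images/new-logo.png"))
```

Parsing only the matched fragments avoids handing Markdown syntax such as `&` or `*` to an HTML serializer, which could otherwise rewrite parts of the file that were never meant to change.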

Why is the `href` attribute validated twice?

I found the following web scraping code in Ryan Mitchell's Web Scraping with Python: from urllib.request import urlopen from bs4 import BeautifulSoup import re pages = set() def getLinks(pageUr...

python web-scraping beautifulsoup urllib

1 answer 0 votes
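The excerpt is cut off before the loop, so the following is only an illustration under an assumption: that the code filters anchors with `find_all('a', href=re.compile(...))` and then also tests `'href' in link.attrs`. In that case the attribute filter already excludes anchors that have no `href`, which would make the second test a defensive duplicate. The sample HTML and the helper name `wiki_links` are made up.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical anchors: two wiki links, one without href, one external.
SAMPLE_HTML = (
    '<a href="/wiki/Python">Python</a>'
    '<a name="anchor-only">No href</a>'
    '<a href="https://example.com/">External</a>'
    '<a href="/wiki/Monty_Python">Monty Python</a>'
)


def wiki_links(html):
    """Return the href of every <a> whose href starts with /wiki/."""
    soup = BeautifulSoup(html, "html.parser")
    links = soup.find_all("a", href=re.compile(r"^/wiki/"))
    # Every tag returned by the filter above already carries an href.
    assert all("href" in link.attrs for link in links)
    return [link["href"] for link in links]


print(wiki_links(SAMPLE_HTML))
```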

How to extract the text content under the XML tag "Study design" with BeautifulSoup

I have an XML file that looks like this:

    <sec id="sec2.1">
      <title>Study design</title>
      This is a secondary analysis of the Childhood Acute Illness and Nutrition (CHAIN) Network prospective cohort which, between November 2016 and January 2019, recruited 3101 children at nine hospitals in Africa and South Asia: Dhaka and Matlab Hospitals (Bangladesh), Banfora Referral Hospital (Burkina Faso), Kilifi County, Mbagathi County and Migori County Hospitals (Kenya), Queen Elizabeth Hospital (Malawi), Civil Hospital (Pakistan), and Mulago National Referral Hospital (Uganda). As described in the published study protocol, <xref rid="bib11" ref-type="bibr"> 11 </xref> children were followed throughout hospital admission and after discharge with follow-up visits at 45, 90 and 180-days post-discharge. Catchment settings differed in urbanisation, access to health care and prevalence of background comorbidities such as HIV and malaria. Prior to study start, sites were audited to optimise care as per national and World Health Organisation (WHO) guidelines. <xref rid="bib12" ref-type="bibr"> 12 </xref> Cross-network harmonisation of clinical definitions and methods was prioritised through staff training and the use of standard operation procedures and case report forms (available online, <ext-link ext-link-type="uri" xlink:href="https://chainnetwork.org/resources/" id="intref0010">https://chainnetwork.org/resources/</ext-link> ).
    </sec>

How can I use BeautifulSoup to extract the text in the element? The code below does not seem to work:

    from bs4 import BeautifulSoup

    with open('test.xml', 'r') as file:
        soup = BeautifulSoup(file, 'xml')

    # Find and print all tags
    for tag in soup.find_all('sec'):
        print(tag.text)

Thank you very much for your comments and help.

You have to select your elements more specifically. Based on your code, chain .p to your tag; this always gets the first <p> within the selected <sec>:

    for tag in soup.find_all('sec'):
        print(tag.p.get_text(strip=True))

xml parsing beautifulsoup xml-parsing

1 answer 0 votes

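How do I use BeautifulSoup to extract the value of a tel link from an internal web page?

I'm trying to write a script that collects information about phones and adds it to a dataframe. I have a dataset with customer IDs. Meanwhile, the phone numbers are stored on the web

pandas beautifulsoup

1 answer 0 votes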
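A minimal sketch, assuming the phone numbers appear as ordinary `<a href="tel:...">` links in the page: the CSS attribute selector `a[href^="tel:"]` picks out anchors whose `href` begins with `tel:`, and the prefix is then stripped. The sample HTML and the helper name `extract_phone_numbers` are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment of an internal page.
SAMPLE_HTML = (
    '<p><a href="tel:+15550100">Call us</a> '
    '<a href="mailto:a@example.com">Mail</a> '
    '<a href="tel:555-0199">Support</a></p>'
)


def extract_phone_numbers(html):
    """Return the number part of every tel: link, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    prefix = "tel:"
    return [a["href"][len(prefix):] for a in soup.select('a[href^="tel:"]')]


print(extract_phone_numbers(SAMPLE_HTML))
```

The resulting list can be stored alongside a customer ID when building a dataframe row.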

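Python beautifulsoup and openpyxl

So, I'm trying to do data extraction (a web crawler/scraper) with beautifulsoup, and I'm trying to iterate over every tag in the HTML to find the data I want. My goal is to get a specific

python excel beautifulsoup openpyxl

2 answers 0 votes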

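How do I scrape the total number of customer reviews for a specific product from the Amazon website?

I'm trying to use Python and the Beautiful Soup library to scrape the total number of customer reviews from an Amazon page. Details such as the customer rating and star rating are being scraped, but not the r...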