Extracting data from an interactive chart with Python (Selenium + BeautifulSoup)


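I need to extract the data from the asset evolution chart on this page (example): https://investidor10.com.br/carteira/572422/ (an image of the chart is attached). I need the data for every bar in the chart: asset value, capital gains, and amount invested. I tried extracting it with Selenium + BeautifulSoup, but I could not, because the data is not present in the HTML and only appears when you click a bar in the chart. I searched online but found nothing that helps with this problem.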

In short, does anyone know how to extract the data shown in the asset evolution chart? It does not have to use Selenium + BeautifulSoup, but it does have to be Python.

[Image: the chart I want to extract data from]

I tried Selenium + BeautifulSoup, but I do not know how to extract the data, because it is dynamic and only appears when you select a bar in the chart.

python selenium-webdriver web-scraping beautifulsoup web-crawler
1 answer

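There is no need to use selenium or beautifulsoup here. In my opinion, the simplest and most direct approach is to call the API that the chart loads its data from.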

How can you tell whether content is loaded/rendered dynamically in a case like this?

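First indicator: when you open the site in a browser as a human, you notice a loading animation or a delay in that area. Second indicator: the content is not included in the static response to the page request. You can then open the browser's developer tools and look at the XHR requests in the Network panel to see which data is loaded from which resource. -> http://developer.chrome.com/docs/devtools/network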
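The second indicator can also be checked from Python. The following is only a minimal sketch: `fetch_static_html` and `appears_in_static_html` are hypothetical helper names, it uses the standard library's `urllib` rather than `requests` so it runs without extra packages, and it assumes that `sum_equity` (a key from the API response shown in this answer) is a reasonable marker to search for.

```python
import urllib.request


def fetch_static_html(url: str) -> str:
    """Download the page exactly as the server sends it (no JavaScript is run)."""
    request = urllib.request.Request(url, headers={"User-Agent": "some_valid_agent"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def appears_in_static_html(html: str, marker: str) -> bool:
    """Return True if the marker text occurs in the raw HTML (case-insensitive)."""
    return marker.lower() in html.lower()


if __name__ == "__main__":
    html = fetch_static_html("https://investidor10.com.br/carteira/572422/")
    # Assumption: if this key is missing from the raw HTML, the chart data
    # is probably fetched by a separate request after the page loads.
    print(appears_in_static_html(html, "sum_equity"))
```

If the marker is absent from the static HTML but visible in the rendered page, that suggests the data arrives through a separate request, which is what the Network panel would then confirm.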

If there is an API, use it; otherwise fall back to selenium.

Example
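import requests

# Call the JSON endpoint that the chart itself loads its data from.
response = requests.get(
    'https://investidor10.com.br/api/carteiras/charts/evolucao-patrimonio/572422/12/all',
    headers={'user-agent': 'some_valid_agent'},  # placeholder: send a real browser user agent
    timeout=30,
)
response.raise_for_status()  # fail early on an HTTP error status
print(response.json())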
Result
[{"month":"05\/23","date":"05\/23","sum_applied":3599.2496,"sum_equity":3794.7088999999996,"sum_flow":5198.9427,"profitability":0},{"month":"06\/23","date":"06\/23","sum_applied":4199.3396,"sum_equity":4621.3407,"sum_flow":6038.5586,"profitability":12.73},{"month":"07\/23","date":"07\/23","sum_applied":5163.1996,"sum_equity":5585.579299999999,"sum_flow":7031.7742,"profitability":10.32},{"month":"08\/23","date":"08\/23","sum_applied":7282.6224,"sum_equity":7601.287600000001,"sum_flow":9065.3382,"profitability":5.83},{"month":"09\/23","date":"09\/23","sum_applied":8304.412400000001,"sum_equity":8625.8636,"sum_flow":10053.2882,"profitability":5.14},{"month":"10\/23","date":"10\/23","sum_applied":8845.77940001,"sum_equity":8980.3758,"sum_flow":10872.9838,"profitability":2.68},{"month":"11\/23","date":"11\/23","sum_applied":10658.68980001,"sum_equity":11171.1518,"sum_flow":13193.6327,"profitability":5.8},{"month":"12\/23","date":"12\/23","sum_applied":12046.64240001,"sum_equity":13070.843799999999,"sum_flow":15134.4752,"profitability":9.41},{"month":"01\/24","date":"01\/24","sum_applied":13077.23640001,"sum_equity":13844.4296,"sum_flow":15645.7756,"profitability":6.68},{"month":"02\/24","date":"02\/24","sum_applied":14686.01640001,"sum_equity":15452.7688,"sum_flow":17294.9096,"profitability":5.94},{"month":"03\/24","date":"03\/24","sum_applied":16045.49640001,"sum_equity":16943.5274,"sum_flow":18794.0035,"profitability":6.26},{"month":"04\/24","date":"04\/24","sum_applied":17719.15640001,"sum_equity":17760.8627,"sum_flow":20053.741,"profitability":0.8},{"month":"05\/24","date":"05\/24","sum_applied":21831.56640001,"sum_equity":22332.2705,"sum_flow":24650.6796,"profitability":2.76}]
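To get from this response to the three values the question asks for, something like the sketch below may help. It rests on assumptions that are not documented anywhere in this thread: that `sum_equity` is the asset value, that `sum_applied` is the amount invested, and that the capital gain can be approximated as the difference between the two. The meaning of `sum_flow` is unclear, so it is left out. `summarize` is a hypothetical helper name, and `SAMPLE` holds the first two records from the result above.

```python
import json

# First two records copied from the API result above.
SAMPLE = json.loads(r'''[
{"month":"05\/23","date":"05\/23","sum_applied":3599.2496,"sum_equity":3794.7088999999996,"sum_flow":5198.9427,"profitability":0},
{"month":"06\/23","date":"06\/23","sum_applied":4199.3396,"sum_equity":4621.3407,"sum_flow":6038.5586,"profitability":12.73}
]''')


def summarize(records):
    """Build one row per chart bar from the API records.

    Assumption: the field meanings are inferred from their names only.
    """
    rows = []
    for record in records:
        invested = record["sum_applied"]  # assumed: amount invested
        equity = record["sum_equity"]     # assumed: asset value
        rows.append({
            "month": record["month"],
            "invested": round(invested, 2),
            "equity": round(equity, 2),
            # Assumed definition of capital gain: asset value minus invested amount.
            "gain": round(equity - invested, 2),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(row)
```

In practice you would pass the parsed JSON from the API call to `summarize` instead of `SAMPLE`, and compare a few rows against the values the chart shows when you click a bar before trusting the field mapping.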