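Hello, I'm having a problem scraping with Django and BeautifulSoup. Everything in the code seems fine: I point it at a dynamic GeoServer map rendered on the open web and look the div up by name, but it doesn't return what I need. Can anyone help? Thanks!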
#utils.py
import requests
from bs4 import BeautifulSoup

url = 'https://sgainacirsa.ddns.net/cirsa'
response = requests.get(url)
if response.status_code == 200:
    soup = BeautifulSoup(response.content, 'html.parser')
    map_div = soup.find('div', class_='map')
    if map_div:
        print(map_div)
    else:
        print("No se encontró el div con la clase 'map'.")
else:
    print(f"Error al acceder a la página: {response.status_code}")
#views.py
from django.shortcuts import render
import requests
from bs4 import BeautifulSoup

def index(request):
    return render(request, "index.html")

def scrape_(request):
    url = 'https://sgainacirsa.ddns.net/cirsa'
    try:
        response = requests.get(url)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        return render(request, 'scrape.html', {'error': f"Error al realizar la solicitud: {e}"})
    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html.parser')
        map_div = soup.find('div', class_='map')
        map_content = str(map_div) if map_div else "No se encontró el div con la clase 'map'."
    else:
        map_content = f"Error al acceder a la página: {response.status_code}"
    return render(request, 'scrape.html', {'map_content': map_content})
#HTML Out
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Web Scraping con Django</title>
</head>
<body>
<h1>Contenido del div con la clase 'map'</h1>
<div>
{{ map_content|safe }}
</div>
{% if error %}
<div>{{ error }}</div>
{% endif %}
</body>
</html>
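If I understand correctly, it would be better to use:
soup.find("div", id="map")
Looking at https://sgainacirsa.ddns.net/cirsa/, I can see that id=map is clearly defined there.
In your case that won't help, though, because this div only appears later, after the JS has run. You should use something like Selenium, which runs the JS before you grab the data to scrape.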
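A minimal sketch of that approach, assuming Selenium 4 with a local Chrome install, and assuming the rendered page really does contain an element with id="map" (as reported above; not verified here). The helper names fetch_rendered_html and extract_map_div are hypothetical, and the 15-second wait is an arbitrary choice.

```python
from typing import Optional

from bs4 import BeautifulSoup, Tag


def fetch_rendered_html(url: str, timeout: int = 15) -> str:
    """Load the page in headless Chrome and return the HTML after JS has run."""
    # Imported here so the parsing helper below works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Assumption: the JS-created element carries id="map".
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.ID, "map"))
        )
        return driver.page_source
    finally:
        driver.quit()  # always release the browser, even if the wait times out


def extract_map_div(html: str) -> Optional[Tag]:
    """Return the div with id="map", or None if the HTML does not contain it."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("div", id="map")
```

In the Django view, something like map_div = extract_map_div(fetch_rendered_html(url)) could stand in for the requests.get call. Starting a browser on every request is slow, so caching the result or running the scrape outside the request cycle may be worth considering.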