Scraping a website with an embedded Google Map


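I'm looking at this website: https://silkroadmed.com/hospitals/

Is there a way to scrape the red pins on the Google Map? When you click a pin, you get a name, address, and phone number. Ideally, I'd like to be able to build a list of locations.

Is there a way to do this with an API? When I inspect the network requests, I don't see one.

Or is the only option to use Selenium to enter various zip codes and scrape the text manually, one pin at a time?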

python selenium-webdriver web-scraping
1 Answer

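Try this. The pin data comes from a JSONP endpoint on cdn.storelocatorwidgets.com, which can be requested directly: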

import json
import requests
import re

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0',
    'Accept': '*/*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Connection': 'keep-alive',
    'Referer': 'https://silkroadmed.com/',
    'Sec-Fetch-Dest': 'script',
    'Sec-Fetch-Mode': 'no-cors',
    'Sec-Fetch-Site': 'cross-site',
}

params = {
    'callback': 'slw',      # name of the JSONP wrapper function
    '_': '1716699895443',   # timestamp sent by the browser, presumably for cache busting
}

response = requests.get(
    'https://cdn.storelocatorwidgets.com/json/AJ587baQBwcvO8ZBAoQKFx0r5DhPhcub',
    params=params,
    headers=headers,
)

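# Stop here if the server returned an HTTP error.
response.raise_for_status()
data = response.text

# The response is JSONP: the JSON payload is wrapped in "slw(...)".
# Strip the wrapper so that only the JSON remains.
data = re.sub(r"^slw\(|\)$", "", data)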

data = json.loads(data)

with open("hospital-locations.json", "wt") as file:
    json.dump(data, file, indent=2)

# Write one CSV row per location: quoted name, latitude, longitude.
with open("hospital-locations.csv", "wt") as file:
    file.write("name,lat,lon\n")
    for location in data["stores"]:
        file.write('"%s", %f, %f\n' % (
            location["name"],
            location["data"]["map_lat"],
            location["data"]["map_lng"],
        ))
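
The regular expression above is tied to the callback name slw. If the callback name changes, or the body ends with a semicolon, a more tolerant unwrapping step may be useful. The following is only a sketch: the helper name unwrap_jsonp and the sample payload are made up for illustration and are not part of the original answer or of any library.

```python
import json
import re

# Matches "<callback>(<payload>)" with an optional trailing semicolon.
_JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def unwrap_jsonp(text):
    """Return the parsed JSON payload of a JSONP response body.

    Hypothetical helper: works for any callback name, not only "slw".
    """
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError("response does not look like JSONP")
    return json.loads(match.group(1))


# Made-up sample that mimics the shape used in the answer ("stores" -> "data").
sample = 'slw({"stores": [{"name": "Example, MD", "data": {"map_lat": 1.5, "map_lng": -2.5}}]});'
stores = unwrap_jsonp(sample)["stores"]
print(stores[0]["name"])
```

With the real response, unwrap_jsonp(response.text) would take the place of the re.sub and json.loads steps.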

The resulting data looks like this:

name,lat,lon
"Aashish P Gupta, MD", 28.087052, -80.613514
"Abdallah Naddaf, MD", 40.866809, -79.880958
"Abindra Sigdel, MD", 38.248064, -85.751044
"Adam D. Levitt, MD", 28.525766, -81.377504
"Adam Keefer, MD", 32.782840, -79.949424
"Adam Ring, MD", 37.489188, -122.224276
"Adel Barkat, MD", 36.147114, -95.967645
"Adnan Rizvi, MD", 47.648101, -117.413493
"Afshin Skibba, MD", 35.936479, -84.010499
"Ahmad Hussain, M.D.", 34.088233, -117.893752
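
The rows above are formatted by hand, so a name that itself contains a double quote would produce a malformed line. The question also asks for address and phone number, which this output does not include. A possible variation using Python's csv module is sketched below. The function name write_locations is hypothetical, and the key names "address" and "phone" are assumptions about the JSON; check the saved hospital-locations.json for the actual keys.

```python
import csv
import io


def write_locations(stores, file, extra_fields=()):
    """Write one CSV row per store: name, lat, lon, then any extra fields."""
    writer = csv.writer(file)
    writer.writerow(["name", "lat", "lon", *extra_fields])
    for store in stores:
        details = store["data"]
        writer.writerow([
            store["name"],
            details["map_lat"],
            details["map_lng"],
            # Missing keys become empty cells instead of raising KeyError.
            *(details.get(field, "") for field in extra_fields),
        ])


# Made-up record for illustration; "address" and "phone" are assumed key names.
stores = [{
    "name": 'Jane "JD" Doe, MD',
    "data": {"map_lat": 1.5, "map_lng": -2.5, "address": "1 Example St"},
}]
buffer = io.StringIO()
write_locations(stores, buffer, extra_fields=("address", "phone"))
print(buffer.getvalue())
```

When writing to a real file rather than a StringIO, the csv documentation recommends opening it with newline="". Note that csv.writer quotes only the fields that need it and puts no space after commas, so the output differs slightly in layout from the sample above.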

Here are some of the scraped locations, plotted from R using {leaflet}.

[Image: the scraped locations plotted on a map]
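
To process or plot the locations in Python instead of R, the CSV first has to be read back. Because each comma in the file is followed by a space, a plain csv.reader would keep that space in the numeric fields and would not treat the quoted names correctly in every case. The sketch below passes skipinitialspace=True to handle this. The function name read_locations is hypothetical, and the sample rows are copied from the output shown earlier.

```python
import csv
import io


def read_locations(file):
    """Yield (name, lat, lon) tuples from the name,lat,lon CSV."""
    # skipinitialspace=True ignores the space that follows each comma.
    reader = csv.reader(file, skipinitialspace=True)
    next(reader)  # skip the "name,lat,lon" header row
    for name, lat, lon in reader:
        yield name, float(lat), float(lon)


# Two rows copied from the sample output, used here in place of the real file.
sample = io.StringIO(
    'name,lat,lon\n'
    '"Aashish P Gupta, MD", 28.087052, -80.613514\n'
    '"Abdallah Naddaf, MD", 40.866809, -79.880958\n'
)
locations = list(read_locations(sample))
print(locations[0])
```

For the real file, pass open("hospital-locations.csv", newline="") instead of the StringIO object.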
