Using Python 3.9, how do I get all physical addresses from the URL https://www.tamoil.ch/en/store-locator into MS Excel?


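I want to get all the physical addresses from this URL [https://www.tamoil.ch/en/store-locator] into MS Excel. The spreadsheet my code produces contains only the header row and no addresses.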

import requests
from bs4 import BeautifulSoup
import pandas as pd

# Send a GET request to the website
url = "https://www.tamoil.ch/en/store-locator"

response = requests.get(url)

# Parse the HTML content
soup = BeautifulSoup(response.content, 'html.parser')

# Find all elements containing store information
store_elements = soup.find_all('div', class_='store-element')

# Extract addresses
addresses = []
for store in store_elements:
    address = store.find('p', class_='address').text.strip()
    addresses.append(address)

# Create a DataFrame
df = pd.DataFrame({'Address': addresses})

# Save to Excel file
excel_file = 'tamoil_addresses.xlsx'
df.to_excel(excel_file, index=False)

print(f"Addresses saved to {excel_file}")
python html pandas web-scraping beautifulsoup
1 Answer

Could you check whether this is what you are looking for?

import requests
import pandas as pd
import urllib3

# verify=False below turns off TLS certificate checks, so silence the resulting warning
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

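# page-data endpoint: station addresses and coordinates in JSON format
url = "https://www.tamoil.ch/page-data/en/store-locator/page-data.json"

# verify=False skips TLS certificate verification; drop it if the certificate validates on your machine
resp = requests.get(url, verify=False)
resp.raise_for_status()  # stop early on an HTTP error instead of failing on the JSON lookup
data = resp.json()['result']['data']['allStation']['nodes']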

slug_r = []
name_r = []
lat_r = []
lng_r = []
street_r = []
zip_code_r = []
city_r = []

for i in data:
    slug = i['slug']
    name = i['name']
    lat = i['lat']
    lng = i['lng']
    street = i['street']
    zip_code = i['zip']
    city = i['city']['name']

    # append each field to its column list
    slug_r.append(slug)
    name_r.append(name)
    lat_r.append(lat)
    lng_r.append(lng)
    street_r.append(street)
    zip_code_r.append(zip_code)
    city_r.append(city)
    print(f"-----------------------\nSLUG:   {slug}\nNAME:   {name}\nLATITUDE:   {lat}\nLONGITUDE:   {lng}\nSTREET ADDRESS:   {street}\nZIP CODE:   {zip_code}\nCITY:   {city}")

data_dict = {
    'SLUG': slug_r,
    'NAME': name_r,
    'LATITUDE': lat_r,
    'LONGITUDE': lng_r,
    'STREET': street_r,
    'ZIP CODE': zip_code_r,
    'CITY': city_r
}
df = pd.DataFrame(data_dict)
df.to_excel('Address.xlsx', index=False)
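The seven parallel lists above could also be replaced by one dictionary per station, which makes it easy to add a single combined address column. This is only a minimal sketch: `station_to_row`, the `ADDRESS` column, and the sample nodes are hypothetical names and data made up for illustration, and the key names are assumed to match the ones used in the loop above.

```python
from typing import Any, Dict, List


def station_to_row(node: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one station node into a single spreadsheet row."""
    # Tolerate a missing or null "city" object instead of raising.
    city = (node.get("city") or {}).get("name", "")
    street = node.get("street") or ""
    zip_code = node.get("zip") or ""
    # Join the non-empty parts into one postal-style address string.
    locality = " ".join(part for part in (str(zip_code), city) if part)
    full_address = ", ".join(part for part in (street, locality) if part)
    return {
        "SLUG": node.get("slug"),
        "NAME": node.get("name"),
        "LATITUDE": node.get("lat"),
        "LONGITUDE": node.get("lng"),
        "STREET": street,
        "ZIP CODE": zip_code,
        "CITY": city,
        "ADDRESS": full_address,
    }


# Hypothetical sample nodes, made up for illustration only.
sample_nodes = [
    {"slug": "station-a", "name": "Station A", "lat": 46.0, "lng": 7.0,
     "street": "Example Street 1", "zip": "1000", "city": {"name": "Exampleville"}},
    {"slug": "station-b", "name": "Station B", "lat": 47.0, "lng": 8.0,
     "street": "Sample Road 2", "zip": "2000", "city": None},
]

rows: List[Dict[str, Any]] = [station_to_row(n) for n in sample_nodes]
for row in rows:
    print(row["ADDRESS"])
```

With the real response, `rows = [station_to_row(n) for n in data]` followed by `pd.DataFrame(rows).to_excel('Address.xlsx', index=False)` should write the same columns plus the combined one, since `pd.DataFrame` accepts a list of dictionaries.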

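I used their page-data (JSON) endpoint to fetch all of these addresses. Your original script produced only the header, which suggests that the HTML returned by requests does not contain the div.store-element blocks your selectors look for, so find_all returns an empty list. The JSON endpoint provides the station data directly.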
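The page-data URL looks like an internal file of the site rather than a documented API (that is an assumption on my part), so its layout may change without notice. A small helper like the one sketched below can turn a silent layout change into a readable error. `extract_nodes`, `NODE_PATH`, and the sample payload are hypothetical names and data; only the key path itself comes from the code above.

```python
from typing import Any, Dict, List

# Key path used in the answer's code to reach the station list.
NODE_PATH = ("result", "data", "allStation", "nodes")


def extract_nodes(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Walk NODE_PATH through the decoded JSON and return the station list."""
    current: Any = payload
    for depth, key in enumerate(NODE_PATH):
        if not isinstance(current, dict) or key not in current:
            walked = " -> ".join(NODE_PATH[:depth]) or "<root>"
            raise KeyError(f"expected key {key!r} under {walked}")
        current = current[key]
    if not isinstance(current, list):
        raise TypeError(f"expected a list of stations, got {type(current).__name__}")
    return current


# Hypothetical payload with the same nesting, made up for illustration.
sample_payload = {"result": {"data": {"allStation": {"nodes": [{"slug": "station-a"}]}}}}
nodes = extract_nodes(sample_payload)
print(len(nodes))
```

In the script above, `data = extract_nodes(resp.json())` would be a drop-in replacement for the chained subscript lookup.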

Let me know if I missed anything from your question.
