Passing multiple URLs into a dict and extracting data

Question    Votes: 0    Answers: 1

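I wrote the code below. It passes multiple URLs to an API, and I want the output of each one written to a separate pandas DataFrame. It (sort of) works, but the results are incorrect.

1) "Success!" seems to be printed far too many times once the function is entered. Why is that?

2) The output is identical for every DataFrame, and I can't tell where the mistake is.

Here is the function: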

def data_extract(url):
    payload = {'limit':'200000'}

    # Persists parameters across requests
    s = requests.Session()

    # To determine success of request, and error code
    for url in url:
        try:
            response = s.get(url)
            # If the response was successful, no Exception will be raised
            response.raise_for_status()
        except HTTPError as http_err:
            print(f'HTTP error occurred: {http_err}')
        except Exception as err:
            print(f'Other error occurred: {err}')
        else:
            # Ret
            jsonData = s.get(url, params=payload).json()
            print('Success!')

    df_tmr = pd.DataFrame(jsonData['records'])
    return df_tmr

Here is how the function is called:

urls = {
    # Rainfall data
    'tot_rain_mth': 'https://data.gov.sg/dataset/5942f8bd-4240-4f68-acd2-a5a276958237/resource/778814b8-1b96-404b-9ac9-68d6c00e637b/data',
    'no_days_rain_mth': 'https://data.gov.sg/dataset/rainfall-monthly-number-of-rain-days/resource/8b94f596-91fd-4545-bf9e-7a426493b674/data',
    'max_rain_mth': 'https://data.gov.sg/dataset/rainfall-monthly-maximum-daily-total/resource/df4d391e-6950-4fc6-80cd-c9b9ef6354fe/data',
    # Temperature Data
    'mean_sun_dur_mth': 'https://data.gov.sg/dataset/sunshine-duration-monthly-mean-daily-duration/resource/0230819f-1c83-4980-b738-56136d6dc300/data',
    'wet_bulb_hr': 'https://data.gov.sg/dataset/wet-bulb-temperature-hourly/resource/0195dc7a-2f49-4107-ac7c-3112ca4a09a8/data',
    'min_air_temp_day': 'https://data.gov.sg/dataset/surface-air-temperature-mean-daily-minimum/resource/ad0d8a97-9321-42e9-ac6f-46bf12845d44/data',
    'min_air_temp_mth': 'https://data.gov.sg/dataset/surface-air-temperature-monthly-absolute-extreme-minimum/resource/0c5b9752-2488-46cc-ae1c-42318d0f8865/data',
    'mean_air_temp_mth': 'https://data.gov.sg/dataset/surface-air-temperature-monthly-mean/resource/07654ce7-f97f-49c9-81c6-bd41beba4e96/data',
    'max_air_temp_day': 'https://data.gov.sg/dataset/surface-air-temperature-mean-daily-maximum/resource/c7a7d2fd-9d32-4508-92ef-d1019e030a2f/data',
    'max_air_temp_mth': 'https://data.gov.sg/dataset/air-temperature-absolute-extremes-maximum/resource/96e66346-68bb-4ca9-b001-58bbf39e36a7/data',
    # Humidity Data
    'min_hum_mth': 'https://data.gov.sg/dataset/relative-humidity-monthly-absolute-extreme-minimum/resource/585c24a5-76cd-4c48-9341-9223de5adc1d/data',
    'mean_hum_mth': 'https://data.gov.sg/dataset/relative-humidity-monthly-mean/resource/4631174f-9858-463d-8a88-f3cb21588c67/data',
    'mean_hum_yr': 'https://data.gov.sg/dataset/relative-humidity-annual-mean/resource/77b9059f-cc9a-4f4f-a495-9c268945191b/data' 
}

df={}
for i in range(len(urls.keys())):
    df[str(i)] = pd.DataFrame()
#print('Name of Dataframe:', df)
    df[str(i)] = data_extract(urls.values())
print (df['0'])
print (df['1'])

--> Sorry about the formatting; I couldn't get it quite right on SO.

python api loops dictionary url
1 Answer
Votes: 1

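import requests
import pandas as pd

def data_extract(url):
    # Fetch a single URL and return its records as a DataFrame
    print(url)
    payload = {'limit': '200000'}
    s = requests.Session()
    try:
        # One request is enough: send the parameters and check the status
        response = s.get(url, params=payload)
        response.raise_for_status()
        jsonData = response.json()
        print('Success!')
    except Exception as err:
        print(f'Other error occurred: {err}')
        # Return an empty frame so jsonData is never used while undefined
        return pd.DataFrame()

    df_tmr = pd.DataFrame(jsonData['records'])
    return df_tmr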

urls = {
    # Rainfall data
    'tot_rain_mth': 'https://data.gov.sg/dataset/5942f8bd-4240-4f68-acd2-a5a276958237/resource/778814b8-1b96-404b-9ac9-68d6c00e637b/data',
    'no_days_rain_mth': 'https://data.gov.sg/dataset/rainfall-monthly-number-of-rain-days/resource/8b94f596-91fd-4545-bf9e-7a426493b674/data',
    'max_rain_mth': 'https://data.gov.sg/dataset/rainfall-monthly-maximum-daily-total/resource/df4d391e-6950-4fc6-80cd-c9b9ef6354fe/data',
    # Temperature Data
    'mean_sun_dur_mth': 'https://data.gov.sg/dataset/sunshine-duration-monthly-mean-daily-duration/resource/0230819f-1c83-4980-b738-56136d6dc300/data',
    'wet_bulb_hr': 'https://data.gov.sg/dataset/wet-bulb-temperature-hourly/resource/0195dc7a-2f49-4107-ac7c-3112ca4a09a8/data',
    'min_air_temp_day': 'https://data.gov.sg/dataset/surface-air-temperature-mean-daily-minimum/resource/ad0d8a97-9321-42e9-ac6f-46bf12845d44/data',
    'min_air_temp_mth': 'https://data.gov.sg/dataset/surface-air-temperature-monthly-absolute-extreme-minimum/resource/0c5b9752-2488-46cc-ae1c-42318d0f8865/data',
    'mean_air_temp_mth': 'https://data.gov.sg/dataset/surface-air-temperature-monthly-mean/resource/07654ce7-f97f-49c9-81c6-bd41beba4e96/data',
    'max_air_temp_day': 'https://data.gov.sg/dataset/surface-air-temperature-mean-daily-maximum/resource/c7a7d2fd-9d32-4508-92ef-d1019e030a2f/data',
    'max_air_temp_mth': 'https://data.gov.sg/dataset/air-temperature-absolute-extremes-maximum/resource/96e66346-68bb-4ca9-b001-58bbf39e36a7/data',
    # Humidity Data
    'min_hum_mth': 'https://data.gov.sg/dataset/relative-humidity-monthly-absolute-extreme-minimum/resource/585c24a5-76cd-4c48-9341-9223de5adc1d/data',
    'mean_hum_mth': 'https://data.gov.sg/dataset/relative-humidity-monthly-mean/resource/4631174f-9858-463d-8a88-f3cb21588c67/data',
    'mean_hum_yr': 'https://data.gov.sg/dataset/relative-humidity-annual-mean/resource/77b9059f-cc9a-4f4f-a495-9c268945191b/data' 
}

df={}
temp = list(urls.values())
for i in range(len(temp)):
    df[str(i)] = data_extract(temp[i])

print(df['0'])
print(df['1'])

if len(df) == len(temp):
  print('success')
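
The loop above stores the frames under '0', '1', and so on, which throws away the descriptive names already present in urls. A minimal sketch of keeping those names as keys follows. The helper extract_all, the stand-in fake_fetch, and the example.invalid URLs are hypothetical and not part of the original post; in real use you would pass data_extract as the fetch argument.

```python
import pandas as pd

def extract_all(urls, fetch):
    """Build one DataFrame per entry in `urls`, keyed by the same names.

    `fetch` is any callable that takes a single URL and returns a
    DataFrame, for example the data_extract function above.
    """
    return {name: fetch(url) for name, url in urls.items()}

# Hypothetical stand-in for data_extract so the sketch runs without network access
def fake_fetch(url):
    return pd.DataFrame({'source': [url]})

# Placeholder URLs, used only for illustration
sample_urls = {
    'tot_rain_mth': 'https://example.invalid/rain',
    'mean_hum_yr': 'https://example.invalid/humidity',
}

frames = extract_all(sample_urls, fake_fetch)
print(frames['tot_rain_mth'])
```

With this shape, frames['tot_rain_mth'] replaces df['0'], so it is harder to mix up which dataset a frame came from.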

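I think this will help you. Inside data_extract you were iterating over all of the URLs but returning only the data from the last one, because the DataFrame is built after the loop has finished. You just need to remove the for loop from the data_extract method.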
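
Both symptoms in the question can be reproduced without any network access. The sketch below is an illustration only: fake_get, buggy_extract, fixed_extract, and the three placeholder URLs are hypothetical stand-ins for the real session call and functions. It counts how many requests each version makes and what each version returns.

```python
calls = []

def fake_get(url):
    # Hypothetical stand-in for s.get(url, params=payload).json(); records each call
    calls.append(url)
    return {'records': [url]}

def buggy_extract(urls):
    # Mirrors the question's structure: loop inside, a single return after the loop
    for url in urls:
        data = fake_get(url)
    return data['records']  # only the last URL's data survives

def fixed_extract(url):
    # Mirrors the answer's structure: one URL in, one result out
    return fake_get(url)['records']

urls = {'a': 'url-a', 'b': 'url-b', 'c': 'url-c'}

# The question calls the function once per key, passing every URL each time
buggy = [buggy_extract(urls.values()) for _ in range(len(urls))]
buggy_calls = len(calls)

calls.clear()
fixed = [fixed_extract(u) for u in urls.values()]
fixed_calls = len(calls)

print(buggy, buggy_calls)  # [['url-c'], ['url-c'], ['url-c']] 9
print(fixed, fixed_calls)  # [['url-a'], ['url-b'], ['url-c']] 3
```

The buggy version makes n × n requests and always returns the last URL's records. With the 13 URLs in the question, and assuming every request succeeds, that would be 13 × 13 = 169 'Success!' lines and 13 identical DataFrames.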
