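I am trying to download a customized XML file from this site: http://data.un.org/Data.aspx?d=CLINO&f=ElementCode:11;StatisticCode:01&c=1,2,5,17,18,44&s=CountryName:asc,WmoStationNumber:asc,StatisticCode:asc&v=1
The approach I know best is pd.read_csv, but in this case right-clicking the download link and copying the link address gives: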
javascript:Download('xml','CLINO','ElementCode:11;StatisticCode:01','s=CountryName:asc,WmoStationNumber:asc,StatisticCode:asc','c=1,2,5,17,18,44','');
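I tried applying the solution posted here, but unfortunately the process diverges at step 4.
Using Python, how can I access the .xml file so that I can download and save it?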
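The javascript: link cannot be fetched directly, but the arguments it passes to Download(...) can be pulled out as plain strings. The following is a minimal sketch of that step only; parse_download_link is a hypothetical helper name, and what the site does with each argument is not confirmed here. The second and third arguments do match the 'DataMartId' and 'DataFilter' values used in the request parameters, which suggests how they map.

```python
import re

# The link address copied from the page, split only for readability.
link = ("javascript:Download('xml','CLINO','ElementCode:11;StatisticCode:01',"
        "'s=CountryName:asc,WmoStationNumber:asc,StatisticCode:asc',"
        "'c=1,2,5,17,18,44','');")


def parse_download_link(href):
    """Return the single-quoted arguments of a javascript:Download(...) link."""
    match = re.search(r"Download\((.*)\)", href)
    if match is None:
        raise ValueError("not a Download(...) link")
    # Every argument is a single-quoted string; empty strings are kept.
    return re.findall(r"'([^']*)'", match.group(1))


args = parse_download_link(link)
print(args)
```

The result is a list of six strings, starting with the requested format ('xml') and ending with an empty string.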
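from io import StringIO

import pandas as pd
import requests

# Query parameters for the site's data handler; 'Page' is added per request.
params = {
    'Service': 'page',
    'DataFilter': 'ElementCode:11;StatisticCode:01',
    'DataMartId': 'CLINO',
    'UserQuery': '',
    'c': '1,2,5,17,18,44'
}


def main(url, params):
    # Reuse one session so the connection is shared across all page requests.
    with requests.Session() as req:
        allin = []
        # Pages 1 to 22 are hard-coded; adjust if the result set differs.
        for item in range(1, 23):
            print(f"Extracting Page# {item}")
            params['Page'] = item
            r = req.get(url, params=params)
            # Stop early on an HTTP error instead of parsing an error page.
            r.raise_for_status()
            # Parse the first HTML table in the response.
            df = pd.read_html(StringIO(r.text))[0]
            allin.append(df)
        # Stack the per-page tables and save them as a single CSV file.
        new = pd.concat(allin)
        print(new)
        new.to_csv("Data.csv", index=False)


main("http://data.un.org/Handlers/DataHandler.ashx", params)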
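
The loop above hard-codes the page count and writes 'Page' into the shared params dict. A possible variation, sketched below, copies the parameters for each request and stops as soon as a page yields nothing. Here collect_pages and fake_fetch are hypothetical names, and how the real handler signals the last page is an assumption left to the fetch function.

```python
def collect_pages(fetch_page, base_params, max_pages=100):
    """Call fetch_page(params) for Page=1, 2, ... until it returns None."""
    results = []
    for page in range(1, max_pages + 1):
        # Build a fresh dict so the caller's parameters are never modified.
        page_params = {**base_params, "Page": page}
        result = fetch_page(page_params)
        if result is None:
            break
        results.append(result)
    return results


# Stand-in fetcher for illustration only: pretends that three pages exist.
def fake_fetch(params):
    return f"page {params['Page']}" if params["Page"] <= 3 else None


base = {"DataMartId": "CLINO", "c": "1,2,5,17,18,44"}
pages = collect_pages(fake_fetch, base)
print(pages)
```

In real use, fetch_page would wrap the req.get and pd.read_html calls and return None when no table comes back; pd.read_html raises ValueError when it finds no tables, which is one signal such a wrapper could catch.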