Converting a .sav file to a .csv file in Python


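I want to convert the contents of a *.sav file to a *.csv file in Python. I wrote the following lines to inspect the variables stored in the *.sav file. Now I am not sure how to write that variable data to a .csv file with a header row.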

import scipy.io as spio
on2file = 'ON2_2015_112m_220415.sav'
on2data = spio.readsav(on2file, python_dict=True, verbose=True)

Running the lines above produces the following output:

IDL Save file is compressed
 -> expanding to /var/folders/z4/r3844ql123jgkq1ztdr4jxrm0000gn/T/tmpVE_Iz6.sav
--------------------------------------------------
Date: Mon Feb 15 20:41:02 2016
User: zhangy1
Host: augur
--------------------------------------------------
Format: 9
Architecture: x86_64
Operating System: linux
IDL Version: 7.0
--------------------------------------------------
Successfully read 11 records of which:
 - 7 are of type VARIABLE
 - 1 are of type TIMESTAMP
 - 1 are of type NOTICE
 - 1 are of type VERSION
--------------------------------------------------
Available variables:
 - saved_data [<class 'numpy.recarray'>]
 - on2_grid_smooth [<type 'numpy.ndarray'>]
 - d_lat [<type 'numpy.float32'>]
 - on2_grid [<type 'numpy.ndarray'>]
 - doy [<type 'str'>]
 - year [<type 'str'>]
 - d_lon [<type 'numpy.float32'>]
--------------------------------------------------

Can anyone suggest how to write all of the variable data to a .csv file?

I want to write the variables (year, doy, d_lon, d_lat, on2_grid, on2_grid_smooth) to a CSV or ASCII file that looks like this:

longitude, latitude, on2_grid, on2_grid_smooth   # header 
0.0,0.0,0.0,0.0              
0.0,0.0,0.0,0.0 
0.0,0.0,0.0,0.0 
0.0,0.0,0.0,0.0
..... 

The variables "on2_grid" and "on2_grid_smooth" have the same shape, (101, 202). Both are of type "numpy.ndarray".

python csv numpy scipy
5 Answers
9
votes

First install pyreadstat:

pip install pyreadstat

Then you can use pandas to import the SPSS file into Python very easily:

import pandas as pd
df = pd.read_spss("input_file.sav")

Then you can export the data with the .to_csv() method:

df.to_csv("output_file.csv", index=False)

If you only need to export certain columns, you can select them first:

df[["column_a", "column_b"]].to_csv("output_file.csv", index=False)
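As a small, self-contained illustration of the export step, the sketch below builds a DataFrame by hand (the column names and values are made up, standing in for whatever pd.read_spss returns) and writes only selected columns through the columns argument of to_csv, which gives the same result as selecting the columns first. Note that pd.read_spss targets SPSS .sav files; the IDL save file in the question is read with scipy.io.readsav instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the DataFrame returned by pd.read_spss(...)
df = pd.DataFrame({
    "column_a": [1, 2, 3],
    "column_b": [0.5, 1.5, 2.5],
    "column_c": ["x", "y", "z"],
})

# Write to an in-memory buffer so the example leaves no files behind;
# pass a path such as "output_file.csv" to write to disk instead.
buffer = io.StringIO()
df.to_csv(buffer, columns=["column_a", "column_b"], index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Here column_c is left out of the output, and index=False keeps the row index from being written as an extra first column.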

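1
vote

The latitude and longitude columns in the file produced by this code appear to be swapped. Also, latitude ranges from 0 to 180 (not +90 through 0 to -90). Is 0 measured from the top? Please comment.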


1
vote

I know this solution uses R rather than Python, but it is very simple and works well.

library(foreign)
write.table(read.spss("inFile.sav"), file="outFile.csv", quote = TRUE, sep = ",")

0
votes

I was able to solve my problem by changing the required output format. Here is my code:

import numpy as np
import scipy.io as spio

on2file = 'ON2_2016_112m_220415.sav'   # input file
outfile = 'ON2_2016_112m_220415.csv'   # output file

# Read the IDL save file into a dict of variables
s = spio.readsav(on2file, python_dict=True, verbose=True)

# Build the coordinate grid
#d_lat = s["d_lat"]
#d_lon = s["d_lon"]
lat = np.arange(-90, 90, 1.78218)     # (101,)
lon = np.arange(-180, 180, 1.78218)   # (202,)
# Pass lon first so both coordinate arrays have shape (101, 202), like on2_grid
xlon, ylat = np.meshgrid(lon, lat)

on2grid = np.asarray(s["on2_grid"])
on2gridsmooth = np.asarray(s["on2_grid_smooth"])

nrows, ncols = on2grid.shape

# Flatten each 2-D array into a single column
xlon_grid = xlon.reshape(nrows*ncols, 1)
ylat_grid = ylat.reshape(nrows*ncols, 1)
on2grid_new = on2grid.reshape(nrows*ncols, 1)
on2gridsmooth_new = on2gridsmooth.reshape(nrows*ncols, 1)

# Concatenate the columns side by side
allgriddata = np.concatenate((xlon_grid, ylat_grid, on2grid_new, on2gridsmooth_new), axis=1)

# Write the output file ('a' appends if the file already exists)
with open(outfile, 'a') as f_handle:
    np.savetxt(f_handle, allgriddata, delimiter=",", fmt='%0.3f', header="longitude, latitude, on2_grid, on2_grid_smooth")
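
To make the flattening step easier to check, here is a minimal sketch on a tiny synthetic grid rather than the real file. It assumes the first axis of each grid is latitude and the second is longitude, as the (101, 202) shape suggests. The helper name grid_to_rows and the sample values are hypothetical.

```python
import io

import numpy as np


def grid_to_rows(lon, lat, *grids):
    """Return a table with columns lon, lat, then one column per grid.

    Assumes every grid has shape (len(lat), len(lon)).
    """
    # Passing lon first gives coordinate arrays of shape (len(lat), len(lon))
    xlon, ylat = np.meshgrid(lon, lat)
    for g in grids:
        if np.shape(g) != xlon.shape:
            raise ValueError("grid shape does not match (len(lat), len(lon))")
    columns = [xlon.ravel(), ylat.ravel()] + [np.asarray(g).ravel() for g in grids]
    return np.column_stack(columns)


# Tiny synthetic example: 2 latitudes x 3 longitudes
lat = np.array([-45.0, 45.0])
lon = np.array([-120.0, 0.0, 120.0])
grid = np.arange(6, dtype=float).reshape(2, 3)  # grid[i, j] belongs to (lat[i], lon[j])
smooth = grid / 10.0

rows = grid_to_rows(lon, lat, grid, smooth)

# comments="" writes the header without NumPy's default "# " prefix
out = io.StringIO()
np.savetxt(out, rows, delimiter=",", fmt="%0.3f",
           header="longitude,latitude,on2_grid,on2_grid_smooth", comments="")
csv_text = out.getvalue()
print(csv_text)
```

Each output row pairs one grid cell with its own coordinates, so the cell grid[1, 1] ends up on the row with longitude 0 and latitude 45. If the real arrays turn out to be stored with longitude on the first axis, the meshgrid arguments would need to be swapped.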

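0
votes

I am still working on this, and for now this is my "bad" solution:

First I import the savReaderWriter module to convert the .sav file into a structured array. Then I import numpy to write the structured array out as CSV: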

import savReaderWriter 
import numpy as np

reader_np = savReaderWriter.SavReaderNp("infile.sav")
array = reader_np.to_structured_array("outfile.dat") 
np.savetxt("outfile2.csv", array, delimiter=",")
reader_np.close()

The problem is that I lose the column names during the conversion. I will try to fix that.
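
One possible way to keep the names, sketched here on a hand-built structured array rather than real savReaderWriter output: NumPy stores the field names in array.dtype.names, so they can be joined into a header line for np.savetxt. The field names and values below are invented for illustration, and the approach assumes every field is numeric.

```python
import io

import numpy as np

# Hypothetical structured array standing in for the one returned by
# to_structured_array(); the field names are illustrative only.
array = np.array(
    [(1, 20.5, 3.0), (2, 21.0, 4.5)],
    dtype=[("id", "i4"), ("temperature", "f8"), ("score", "f8")],
)

# The field names live on the dtype; join them to form the CSV header.
header = ",".join(array.dtype.names)

out = io.StringIO()
# comments="" stops np.savetxt from prefixing the header with "# "
np.savetxt(out, array, delimiter=",", fmt="%g", header=header, comments="")
csv_text = out.getvalue()
print(csv_text)
```

Writing to a file instead only requires replacing the StringIO buffer with a path such as "outfile2.csv". String fields would need a different fmt, for example one format specifier per column.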
