Convert a .sav file to a .csv file in Python
I want to convert the contents of a *.sav file into a *.csv file in Python. I am not clear on how to write the variable data to a .csv file with headers. So far, I have written the following lines of code to access the details of the variables in the *.sav file:
import scipy.io as spio
on2file = 'ON2_2015_112m_220415.sav'
on2data = spio.readsav(on2file, python_dict=True, verbose=True)
This is the output when I run the code above:
IDL Save file is compressed
-> expanding to /var/folders/z4/r3844ql123jgkq1ztdr4jxrm0000gn/T/tmpVE_Iz6.sav
--------------------------------------------------
Date: Mon Feb 15 20:41:02 2016
User: zhangy1
Host: augur
--------------------------------------------------
Format: 9
Architecture: x86_64
Operating System: linux
IDL Version: 7.0
--------------------------------------------------
Successfully read 11 records of which:
- 7 are of type VARIABLE
- 1 are of type TIMESTAMP
- 1 are of type NOTICE
- 1 are of type VERSION
--------------------------------------------------
Available variables:
- saved_data [<class 'numpy.recarray'>]
- on2_grid_smooth [<type 'numpy.ndarray'>]
- d_lat [<type 'numpy.float32'>]
- on2_grid [<type 'numpy.ndarray'>]
- doy [<type 'str'>]
- year [<type 'str'>]
- d_lon [<type 'numpy.float32'>]
--------------------------------------------------
Can anyone suggest how I can write all the variable data to a .csv file?
I want to write the variables (year, doy, d_lon, d_lat, on2_grid, on2_grid_smooth) to a CSV or ASCII file that looks like this:
longitude, latitude, on2_grid, on2_grid_smooth # header
0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0
0.0,0.0,0.0,0.0
.....
The "on2_grid" and "on2_grid_smooth" variables have the same shape, (101, 202). Both are of type "numpy.ndarray".
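Before choosing an output layout, it can help to see which entries are 2-D grids and which are scalars or strings. The following is a minimal sketch, assuming a plain dict shaped like the one readsav returns with python_dict=True. The describe_variables helper and the stand-in values are hypothetical and are not taken from the actual file.

```python
import numpy as np

def describe_variables(data):
    """Return (name, type name, shape) for each entry of a dict of variables."""
    rows = []
    for name, value in sorted(data.items()):
        shape = np.shape(value)  # () for scalars and strings
        rows.append((name, type(value).__name__, shape))
    return rows

# Hypothetical stand-in for the dict returned by spio.readsav(..., python_dict=True)
fake_data = {
    "on2_grid": np.zeros((101, 202)),
    "on2_grid_smooth": np.zeros((101, 202)),
    "d_lat": np.float32(1.78218),   # assumed value, for illustration only
    "d_lon": np.float32(1.78218),   # assumed value, for illustration only
    "year": "2015",
    "doy": "112",
}

summary = describe_variables(fake_data)
for name, type_name, shape in summary:
    print(name, type_name, shape)
```

Only the two grids have a non-empty shape, so they are the natural candidates for the rows of the CSV file, while the scalars and strings would fit better in a header line or a separate file.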
For what it's worth, if your .sav file is an SPSS file, you can import it into Python very easily using pandas (pd.read_spss requires the pyreadstat package):
import pandas as pd
df = pd.read_spss("input_file.sav")
You can then export the data with the .to_csv() method:
df.to_csv("output_file.csv", index=False)
If you only need to export certain columns, you can specify that too:
df[["column_a", "column_b"]].to_csv("output_file.csv", index=False)
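As a small illustration of the column selection, here is a sketch that uses an in-memory DataFrame as a hypothetical stand-in for the result of pd.read_spss, so it runs without a .sav file. The column names and values are made up. It passes the selection through the columns parameter of to_csv, which should give the same result as indexing the DataFrame first.

```python
import io
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by pd.read_spss(...)
df = pd.DataFrame({
    "column_a": [1, 2, 3],
    "column_b": [0.5, 1.5, 2.5],
    "column_c": ["x", "y", "z"],
})

# Write to an in-memory buffer instead of a file so the result can be inspected
buffer = io.StringIO()
df.to_csv(buffer, columns=["column_a", "column_b"], index=False)
csv_text = buffer.getvalue()
print(csv_text)
```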
The latitude and longitude columns in the files extracted using your code look interchanged. Also, the latitude scale ranges from 0 to 180 (not +90 to 0 to -90). Does 0 start from the top? Please comment.
I know that this solution uses R instead of Python, but it is really simple and works well.
library(foreign)
write.table(read.spss("inFile.sav"), file="outFile.csv", quote = TRUE, sep = ",")
I solved my problem by changing the required output format. Here is my code:
import scipy.io as spio
import numpy as np
import csv
on2file = 'ON2_2016_112m_220415.sav' # i/p file
outfile = 'ON2_2016_112m_220415.csv' # o/p file
# Read i/p file
s = spio.readsav(on2file, python_dict=True, verbose=True)
# Creating Grid
#d_lat = s["d_lat"]
#d_lon = s["d_lon"]
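lat = np.arange(-90,90,1.78218) # (101,)
lon = np.arange(-180,180,1.78218) # (202,)
# on2_grid has shape (101, 202), assumed to be (lat, lon), so build the
# coordinate grids with that same shape; meshgrid(lat, lon) would give
# (202, 101) and misalign the flattened coordinates with the flattened data
xlon,ylat = np.meshgrid(lon,lat)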
on2grid = np.asarray(s["on2_grid"])
on2gridsmooth = np.asarray(s["on2_grid_smooth"])
nrows = len(on2grid)
ncols = len(on2grid[0])
xlon_grid = xlon.reshape(nrows*ncols,1)
ylat_grid = ylat.reshape(nrows*ncols,1)
on2grid_new = on2grid.reshape(nrows*ncols,1)
on2gridsmooth_new = on2gridsmooth.reshape(nrows*ncols,1)
# Concatenation
allgriddata = np.concatenate((xlon_grid, ylat_grid, on2grid_new, on2gridsmooth_new),axis=1)
# Writing o/p file
# file() only exists in Python 2; open() works in both Python 2 and 3
with open(outfile,'a') as f_handle:
    np.savetxt(f_handle,allgriddata,delimiter=",",fmt='%0.3f',header="longitude, latitude, on2_grid, on2_grid_smooth")
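A quick way to check that the coordinate columns line up with the flattened grid values is to run the same steps on a tiny grid where the expected rows are obvious. This is only a sketch under stated assumptions: the 3 x 4 grid and its values are hypothetical, and the first axis is assumed to be latitude and the second longitude, as the (101, 202) shape suggests.

```python
import io
import numpy as np

# Small hypothetical grid: 3 latitudes x 4 longitudes (the real one is 101 x 202)
lat = np.array([-60.0, 0.0, 60.0])
lon = np.array([-180.0, -90.0, 0.0, 90.0])
grid = np.arange(12, dtype=float).reshape(lat.size, lon.size)

# meshgrid(lon, lat) returns arrays of shape (len(lat), len(lon)), matching grid
lon2d, lat2d = np.meshgrid(lon, lat)
assert lon2d.shape == grid.shape

# ravel() flattens all three arrays in the same row-major order
table = np.column_stack((lon2d.ravel(), lat2d.ravel(), grid.ravel()))

# comments="" stops savetxt from prefixing the header line with "# "
buffer = io.StringIO()
np.savetxt(buffer, table, delimiter=",", fmt="%0.3f",
           header="longitude,latitude,value", comments="")
csv_text = buffer.getvalue()
print(csv_text)
```

Each output row pairs one grid cell with its own longitude and latitude, with longitude varying fastest, which is the order in which a (lat, lon) array is flattened.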
I am working on it and, for the moment, this is my 'poor' solution:
First, I import the savReaderWriter module to convert the .sav file into a structured array. Then I import numpy to convert the structured array into a CSV file:
import savReaderWriter
import numpy as np
reader_np = savReaderWriter.SavReaderNp("infile.sav")
array = reader_np.to_structured_array("outfile.dat")
np.savetxt("outfile2.csv", array, delimiter=",")
reader_np.close()
The problem is that I lose the name attributes during conversion. I will try to solve that.
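One possible way to keep the names is to write the field names of the structured array as the first CSV row. The sketch below assumes the array exposes its column names through dtype.names, as NumPy structured arrays do. The array itself is a hypothetical stand-in, not data read with savReaderWriter, and it uses the standard csv module rather than np.savetxt so that mixed numeric and text fields are handled.

```python
import csv
import io
import numpy as np

# Hypothetical structured array standing in for the one built from the .sav file
array = np.array(
    [(1, 2.5, "a"), (2, 3.5, "b")],
    dtype=[("id", "i4"), ("score", "f8"), ("label", "U1")],
)

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(array.dtype.names)  # field names become the header row
writer.writerows(array.tolist())    # each record becomes one CSV row
csv_text = buffer.getvalue()
print(csv_text)
```

If the real array stores text fields as bytes, they would probably need decoding before writing, otherwise the b'...' representation ends up in the file.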