Reading MATLAB files in Python with SciPy
I'm using Python with the SciPy package to read a MATLAB file.
However, loading takes too long and then crashes.
The dataset is about 50 MB in size.
Is there a better way to read the data and build an edge list?
My Python code:
import scipy.io as io
data=io.loadmat('realitymining.mat')
print data
You could save each field of the struct to a separate text file, e.g.:
save('friends.txt', '-struct', 'network', 'friends', '-ascii')
and load each file separately from Python:
import numpy
friends = numpy.loadtxt('friends.txt')
which loads instantly.
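Since the question asks for an edge list, here is a minimal sketch of turning such a loaded matrix into one. It assumes (this is not confirmed in the thread) that the saved field is a square adjacency-style matrix in which a non-zero entry means an edge; the function name edge_list_from_matrix and the small demo matrix are made up for illustration.

```python
import numpy as np

def edge_list_from_matrix(adjacency):
    """Return (row, col, weight) tuples for every non-zero entry of a 2-D matrix."""
    adjacency = np.atleast_2d(np.asarray(adjacency, dtype=float))
    # Assumption: NaN entries mean "no information" and are treated as no edge.
    mask = np.nan_to_num(adjacency) != 0
    rows, cols = np.nonzero(mask)
    return [(int(i), int(j), float(adjacency[i, j])) for i, j in zip(rows, cols)]

# Made-up stand-in for numpy.loadtxt('friends.txt')
demo = np.array([[0, 1, 0],
                 [1, 0, np.nan],
                 [0, 0, 0]])
edges = edge_list_from_matrix(demo)
print(edges)
```

With the real file you would pass the array returned by numpy.loadtxt instead of demo. Indices here are 0-based, whereas MATLAB's are 1-based.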
I can load it after unzipping, but it strains memory.
When I try to load it with Octave, I get:
octave:1> load realitymining.mat
error: memory exhausted or requested size too large for range of Octave's index type -- trying to return to prompt
In IPython:
In [10]: data.keys()
Out[10]: ['network', 's', '__version__', '__header__', '__globals__']
In [14]: data['__header__']
Out[14]: 'MATLAB 5.0 MAT-file, Platform: MACI, Created on: Tue Sep 29 20:13:23 2009'
In [15]: data['s'].shape
Out[15]: (1, 106)
In [17]: data['s'].dtype
Out[17]: dtype([('comm', 'O'), ('charge', 'O'), ('active', 'O'), ('logtimes', 'O'),...
('my_intros', 'O'), ('home_nights', 'O'), ('comm_local', 'O'), ('data_mat', 'O')])
# 58 fields
In [24]: data['s']['comm'][0,1].shape
Out[24]: (1, 30)
In [31]: data['s']['comm'][0,1][0,1]
Out[31]: ([[732338.8737731482]], [[355]], [[-1]], [u'Packet Data'], [u'Outgoing'],
[[40]], [[nan]])
In [33]: data['s']['comm'][0,1]['date']
Out[33]:
array([[array([[ 732338.86915509]]), array([[ 732338.87377315]]),
...
array([[ 732340.48579861]]), array([[ 732340.52778935]])]], dtype=object)
Look at the pieces. Simply trying to print data or print data['s'] takes too long. Apparently the structure is just too big to format quickly.
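One way to look at the pieces without formatting the whole structure is to print only shapes, dtypes and field counts. The following is a rough sketch; the helper name summarize and the small demo dictionary are hypothetical stand-ins for the real loadmat result.

```python
import numpy as np

def summarize(obj, name="data", depth=0, max_depth=2):
    """Return one short description line per item instead of the full contents."""
    lines = []
    indent = "  " * depth
    if isinstance(obj, dict):
        lines.append("%s%s: dict with %d keys" % (indent, name, len(obj)))
        if depth < max_depth:
            for key in sorted(obj):
                lines.extend(summarize(obj[key], key, depth + 1, max_depth))
    elif isinstance(obj, np.ndarray):
        # Structured arrays (MATLAB structs) expose their field names here.
        fields = obj.dtype.names
        extra = ", %d fields" % len(fields) if fields else ""
        lines.append("%s%s: ndarray shape=%s dtype.kind=%s%s"
                     % (indent, name, obj.shape, obj.dtype.kind, extra))
    else:
        lines.append("%s%s: %s" % (indent, name, type(obj).__name__))
    return lines

# Made-up stand-in for the dictionary returned by scipy.io.loadmat
demo = {
    "__version__": "1.0",
    "s": np.zeros((1, 106), dtype=[("comm", "O"), ("charge", "O")]),
}
for line in summarize(demo):
    print(line)
```

This never formats the array contents, so it should stay fast even when the arrays themselves are large.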
To get at this data in practice, I'd suggest loading it once in Python or MATLAB and then saving the useful pieces to one or more files.
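On the Python side, that "load once, save the pieces" step might look like the sketch below. It uses the variable_names argument of scipy.io.loadmat so that only the named variable is returned, then stores a single field with numpy.save. The helper name extract_field, the file names and the tiny demo file are all made up; the [0, 0] indexing assumes the variable is a 1x1 struct, and I have not checked how much memory this saves on the real file.

```python
import os
import tempfile

import numpy as np
import scipy.io as io

def extract_field(mat_path, variable, field, out_path):
    """Load one struct variable from a MAT-file and save one of its fields as .npy."""
    # variable_names limits the result to the listed variables.
    contents = io.loadmat(mat_path, variable_names=[variable])
    # Assumption: the variable is a 1x1 struct, so the field lives at [0, 0].
    array = contents[variable][field][0, 0]
    np.save(out_path, array)
    return array.shape

# Build a tiny made-up MAT-file so the sketch runs on its own.
workdir = tempfile.mkdtemp()
mat_path = os.path.join(workdir, "demo.mat")
io.savemat(mat_path, {"network": {"friends": np.eye(3)},
                      "other": np.zeros((50, 50))})

npy_path = os.path.join(workdir, "friends.npy")
shape = extract_field(mat_path, "network", "friends", npy_path)
reloaded = np.load(npy_path)
print(shape)
```

After this one-off conversion, later sessions only need numpy.load on the small .npy file. Fields that hold nested objects rather than plain numeric arrays would need extra unpacking before saving.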