Splitting data file columns into separate arrays in Python
I'm new to Python and have been trying to figure this out all day. I have a data file laid out as below:
time I(R_stkb)
Step Information: Temp=0 (Run: 1/11)
0.000000000000000e+000 0.000000e+000
9.999999960041972e-012 8.924141e-012
1.999999992008394e-011 9.623148e-012
3.999999984016789e-011 6.154220e-012
(Note: there are no empty lines between the data lines.)
I want to plot the data using matplotlib functions, so I'll need the two columns in separate arrays.
I currently have:
def plotdata():
    Xvals=[], Yvals=[]
    i = open(file,'r')
    for line in i:
        Xvals,Yvals = line.split(' ', 1)
        print Xvals,Yvals
But obviously it's completely wrong. Can anyone give me a simple answer to this? An explanation of what exactly the lines mean would be helpful. Cheers.
Edit: The first two lines repeat throughout the file.
This is a job for the * operator combined with the zip function, which together transpose a list of rows into a list of columns:
>>> asdf
[[1, 2], [3, 4], [5, 6]]
>>> list(zip(*asdf))
[(1, 3, 5), (2, 4, 6)]
So in the context of your data it might be something like:
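with open(file, 'r') as handle:
    # skip blank lines and the repeating 'time ...' / 'Step ...' header lines,
    # and convert each remaining field from a string to a float
    lines = [[float(v) for v in line.split()] for line in handle
             if line.strip() and line[:4] not in ('time', 'Step')]
# zip(*lines) turns the list of [x, y] rows into one tuple per column
Xvals, Yvals = zip(*lines)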
or, if you really need to be able to mutate the data afterwards, you could just call the list constructor on each tuple:
Xvals, Yvals = [list(block) for block in zip(*lines)]
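To make the zip(*...) approach concrete, here is a minimal sketch assuming Python 3. It reads the sample rows from the question out of a string rather than a real file, and the names sample and parse_columns are made up for illustration.

```python
import io

# Sample text copied from the question; a real script would open the file instead.
sample = """time I(R_stkb)
Step Information: Temp=0 (Run: 1/11)
0.000000000000000e+000 0.000000e+000
9.999999960041972e-012 8.924141e-012
1.999999992008394e-011 9.623148e-012
3.999999984016789e-011 6.154220e-012
"""


def parse_columns(handle):
    """Return two lists of floats, skipping blank and 'time'/'Step' header lines."""
    rows = [
        [float(value) for value in line.split()]
        for line in handle
        if line.strip() and line[:4] not in ('time', 'Step')
    ]
    # zip(*rows) transposes the list of [x, y] rows into an x tuple and a y tuple.
    xs, ys = zip(*rows)
    return list(xs), list(ys)


xvals, yvals = parse_columns(io.StringIO(sample))
print(xvals)
print(yvals)
```

The two returned lists can be passed straight to a plotting call. Note that this merges every step in the file into a single pair of columns.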
One way to do it is:
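Xvals = []
Yvals = []
with open(file, 'r') as i:
    for line in i:
        # skip the repeating 'time ...' and 'Step ...' header lines
        if line[:4] in ('time', 'Step'):
            continue
        x, y = line.split()  # split on any whitespace
        Xvals.append(float(x))
        Yvals.append(float(y))
print(Xvals, Yvals)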
Note the call to the float function, which converts the string you get from the file into a number.
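As a small illustration of why the float call matters, the sketch below (assuming Python 3; the variable names are made up) converts one data line from the question and shows that a header line cannot be converted.

```python
line = "9.999999960041972e-012 8.924141e-012\n"  # one data line from the question

# split() returns two strings and also discards the trailing newline.
x_text, y_text = line.split()

# float() turns each string into a real number that can be plotted.
x, y = float(x_text), float(y_text)
print(type(x_text).__name__, type(x).__name__)

# A header word is not a number, which is why header lines must be skipped.
try:
    float("time")
    header_ok = True
except ValueError:
    header_ok = False
print(header_ok)
```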
This is what numpy.loadtxt is designed for. Try:
import numpy as np
import matplotlib.pyplot as plt
# assuming the time and step information sits on the first 2 lines only
# and you do not want to read them
data = np.loadtxt(file, skiprows=2)
plt.plot(data[:,0], data[:,1])
plt.show()
EDIT: if the time and step header lines are scattered throughout the file and you want to plot the data for every step, you can read the whole file into memory (assuming it is small enough) and then split it on the 'time' strings:
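with open(fname, 'r') as f:
    l = f.read()
for chunk in l.split('time'):
    # drop the two header lines of each chunk and any blank lines
    rows = [s.split() for s in chunk.split('\n')[2:] if s.strip()]
    if not rows:
        continue  # e.g. the empty chunk before the first 'time'
    data = np.array(rows, dtype=float)
    plt.plot(data[:,0], data[:,1])
plt.show()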
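If you would rather not depend on NumPy for the splitting step, the same idea can be sketched in plain Python 3 by starting a new run at every 'Step' header. The function name split_runs is made up, and the second step header and its rows in the sample text are invented for illustration (the question only shows the first step).

```python
def split_runs(lines):
    """Group data rows into one (xs, ys) pair of lists per 'Step' header line."""
    runs = []
    for line in lines:
        if line.startswith('Step'):
            runs.append(([], []))  # a new step begins
        elif line.strip() and not line.startswith('time'):
            if not runs:
                runs.append(([], []))  # data that appears before any header
            x, y = line.split()
            runs[-1][0].append(float(x))
            runs[-1][1].append(float(y))
    return runs


# First step copied from the question; the second step is made up.
sample = """time I(R_stkb)
Step Information: Temp=0 (Run: 1/11)
0.000000000000000e+000 0.000000e+000
9.999999960041972e-012 8.924141e-012
time I(R_stkb)
Step Information: Temp=10 (Run: 2/11)
0.000000000000000e+000 0.000000e+000
1.999999992008394e-011 9.623148e-012
"""

runs = split_runs(sample.splitlines())
for number, (xs, ys) in enumerate(runs, start=1):
    print(number, xs, ys)
```

Each (xs, ys) pair can then be handed to plt.plot inside the loop, followed by a single plt.show() after it, to draw one curve per step.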
Alternatively, you could prefix the header lines with the # comment character and use np.loadtxt, which ignores everything after a # by default.
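A minimal sketch of the '#' variant, assuming NumPy is installed and reading from an in-memory string instead of an edited file. The second step header and its rows are made up for illustration.

```python
import io

import numpy as np

# Header lines prefixed with '#'; the second step and its rows are made up.
text = """# time I(R_stkb)
# Step Information: Temp=0 (Run: 1/11)
0.000000000000000e+000 0.000000e+000
9.999999960041972e-012 8.924141e-012
# time I(R_stkb)
# Step Information: Temp=10 (Run: 2/11)
0.000000000000000e+000 0.000000e+000
1.999999992008394e-011 9.623148e-012
"""

# loadtxt accepts file-like objects and skips '#' comments by default.
data = np.loadtxt(io.StringIO(text))
time_values = data[:, 0]
current_values = data[:, 1]
print(data.shape)
```

Be aware that this loads all steps into one array, so the per-step boundaries are lost. It suits the case where you want every data row in a single plot.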