Read .dat file using Python

I've got a simulation involving internal waves and moving particles in the water, using the MITgcm. The output looks something like this for each time step:

   -9999 0.0000000000000000000  #Time step (0.00000000000000 seconds)
 1308.2021183321899       -14.999709364517091 # Particle 1 (X,Z)
 1308.2020142528656       -24.999521595698688 # Particle 2 (X,Z)
 1308.2018600072618       -34.999345597877536 # .
 1308.2016593336587       -44.999185870669805 # .
 1308.2014165588744       -54.999046508237896 # .
 1308.2011370083103       -64.998931076248894
 1308.2008269116873       -74.998842490305705
 1308.2004933548124       -84.998782925797485
 1308.2001441978532       -94.998753764086956
 1308.1997879652938       -104.99875557384759
 1308.1994336881464       -114.99878812280582
 1308.1990906721119       -124.99885041328211
 1308.1987681881285       -134.99894073461562
 1308.1984750963150       -144.99905672694641
 1308.1982194336249       -154.99919545294702
 1308.1980080134056       -164.99935347476733
 1308.1978461242272       -174.99952693694112
 1308.1977378137256       -184.99971163492469
 1308.2000000000000       -195.00000000000000
 5232.8000000000002       -15.000038916290352
 5232.8000000000002       -25.000064153684303
 5232.8000000000002       -35.000089286157163
 5232.8000000000002       -45.000114270293523
 5232.8000000000002       -55.000139061712051 # Particle 57

Here the line starting with -9999 marks a new time step, and the number after it is the time in seconds. In every other line, the left column is the X position and the right column is the Z position (in meters), and each line is a different particle. So the file contains an enormous number of lines like these, one per particle for every time step.

I would like to plot the time-evolution of the position of my particles. How can I do it? If that's too hard, I would be happy with static plots of all particle positions at different time steps.

Thank you so much.

Edit1: What I tried to do is this, but I didn't show it before because it is far from proper:

 from matplotlib import numpy
 import matplotlib.pyplot as plot
 plot.plot(*np.loadtxt('data.dat',unpack=True), linewidth=2.0)

or this:

 plot.plotfile('data.dat', delimiter=' ', cols=(0, 1), names=('col1', 'col2'), marker='o')

I would use numpy.loadtxt for reading the input, but only because post-processing would also need numpy. You can read all your data into memory, then find the separator lines, then reshape the rest of your data to fit your number of particles. The following assumes that none of the particles ever reaches exactly x=-9999, which should be a reasonable (although not foolproof) assumption.

import numpy as np
filename = 'input.dat'
indata = np.loadtxt(filename, usecols=(0,1)) # make sure the rest is ignored
tlines_bool = indata[:,0]==-9999
Nparticles = np.diff(np.where(tlines_bool)[0][:2])[0] - 1
# TODO: error handling: diff(np.where(tlines_bool)) should be constant
times = indata[tlines_bool,1]
positions = indata[np.logical_not(tlines_bool),:].reshape(-1,Nparticles,2)

The above code produces an Nt-element array times and an array positions of shape (Nt,Nparticles,2) holding each particle's 2d position at each time step. By computing the number of particles, we can let numpy determine the size of the first dimension (this is what the -1 index in reshape() is for).
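The TODO in the snippet above (checking that the separator lines are evenly spaced) could be handled roughly as follows. This is only a sketch under stated assumptions: `load_particles` is a hypothetical helper name, the tiny in-memory sample stands in for the real file, and the file is assumed to begin with a -9999 line.

```python
import io
import numpy as np

def load_particles(source, marker=-9999.0):
    """Hypothetical helper: split marker-separated rows into times and positions."""
    # '#' starts a comment for np.loadtxt by default, so trailing notes are ignored.
    data = np.atleast_2d(np.loadtxt(source, usecols=(0, 1)))
    is_marker = data[:, 0] == marker
    marker_rows = np.flatnonzero(is_marker)
    if marker_rows.size == 0 or marker_rows[0] != 0:
        raise ValueError("input must start with a time-step marker line")
    # Rows between consecutive markers (and after the last one) are particles.
    block_ends = np.append(marker_rows[1:], len(data))
    counts = block_ends - marker_rows - 1
    if np.any(counts != counts[0]):
        raise ValueError("particle count differs between time steps")
    n_particles = int(counts[0])
    times = data[is_marker, 1]
    positions = data[~is_marker].reshape(-1, n_particles, 2)
    return times, positions

# Made-up sample in the same layout as the question's file (not real MITgcm output).
sample = """\
-9999 0.0   # time step
1.0 -15.0   # particle 1
2.0 -25.0
-9999 10.0
1.5 -15.5
2.5 -25.5
"""
times, positions = load_particles(io.StringIO(sample))
print(times.shape, positions.shape)
```

Counting the rows of every block, including the last one, also covers a file with a single time step, and a truncated final block raises an error instead of silently producing a misaligned reshape.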

For plotting you just have to slice into your positions array to extract exactly what you need. In case of 2d x data and 2d y data, matplotlib.pyplot.plot() will automatically plot the columns of the input arrays as functions of each other. Here's an example of how you can visualize this, using your actual input data:

import matplotlib.pyplot as plt
myslice = slice(None,None,500)    # every 500th time step
particle_indices = slice(None)    # every particle
#particle_indices = slice(None,5)   # first 5 particles
#particle_indices = slice(-5,None)  # last 5 particles

plt.figure()
_ = plt.plot(times[myslice],positions[myslice,particle_indices,0])
plt.xlabel('t')
plt.ylabel('x')

plt.figure()
_ = plt.plot(times[myslice],positions[myslice,particle_indices,1])
plt.xlabel('t')
plt.ylabel('z')

plt.figure()
_ = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1])
plt.xlabel('x')
plt.ylabel('z')
plt.show()

[Figures: x(t) plot, z(t) plot, z(x) plot]

Each line corresponds to a single particle. The first two plots show the time-evolution of the x and z components, respectively, and the third plot shows the z(x) trajectories. Note that there are a lot of particles in your data that don't move at all:

>>> sum([~np.diff(positions[:,k,:],axis=0).any() for k in range(positions.shape[1])])
15

(This computes the difference along the time axis of both coordinates for each particle, one particle after the other, and counts the particles for which every difference in both dimensions is 0, i.e. the particles that don't move.) This explains all those horizontal lines in the first two plots; these stationary particles don't show up at all in the third plot (since their trajectory is a single point).
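The same count can be obtained without a Python loop by comparing every time step against the first one. The sketch below uses a small made-up positions array rather than the real data, and the names `stationary` and `moving` are only illustrative:

```python
import numpy as np

# Synthetic positions with shape (Nt, Nparticles, 2): 3 time steps, 3 particles.
positions = np.array([
    [[0.0, -15.0], [5.0, -25.0], [9.0, -35.0]],
    [[0.1, -15.0], [5.0, -25.0], [9.0, -35.2]],
    [[0.2, -15.1], [5.0, -25.0], [9.0, -35.4]],
])

# A particle is stationary if every sample equals its first sample in both
# coordinates; broadcasting compares each time step against positions[0].
stationary = (positions == positions[0]).all(axis=(0, 2))
print(stationary.sum())  # number of particles that never move

# Boolean mask on the particle axis keeps only the moving particles.
moving = positions[:, ~stationary, :]
print(moving.shape)
```

Passing `moving` instead of `positions` to the plotting calls would drop the flat lines from the x(t) and z(t) plots.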

I intentionally introduced some slightly fancy indexing, which makes it easier to play around with your data. As you can see, indexing looks like this: times[myslice], positions[myslice,particle_indices,0], where both slices are defined in terms of...well, a slice. You should look at the documentation, but the short story is that arr[slice(from,to,stride)] is equivalent to arr[from:to:stride], and if any of the arguments is None, then the corresponding index is left empty: arr[slice(-5,None)] is equivalent to arr[-5:], i.e. it will slice the final 5 elements of the array.
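A quick way to convince yourself of this equivalence is to compare both spellings on a toy array. The array names below are made up for the demonstration, and `grid` merely stands in for an array shaped like (Nt, Nparticles, 2):

```python
import numpy as np

arr = np.arange(10)
every_third = slice(None, None, 3)   # same as arr[::3]
last_five = slice(-5, None)          # same as arr[-5:]

same_stride = np.array_equal(arr[every_third], arr[::3])
same_tail = np.array_equal(arr[last_five], arr[-5:])

# Slice objects can be mixed with integer indices in multi-dimensional indexing.
grid = np.arange(24).reshape(4, 3, 2)
x_sub = grid[slice(None, None, 2), slice(None, 2), 0]  # same as grid[::2, :2, 0]

print(same_stride, same_tail, x_sub.shape)
```

Storing the slice in a variable means the selection is changed in one place and reused in every indexing expression.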

So, in case you plot a reduced number of trajectories (since 57 is a lot), you might consider adding a legend. This only makes sense as long as the default color cycle of matplotlib lets you distinguish between particles; otherwise you have to either set colors manually or change the default color cycle of your axes. For this you will have to keep the handles that are returned from plot:

particle_indices = slice(None,5)   # first 5 particles
plt.figure()
lines = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1])
plt.xlabel('x')
plt.ylabel('z')
plt.legend(lines,['particle {}'.format(k) for k in range(len(lines))])
plt.show()

[Figure: legend example]
