
Reading data from a log file in Python

I'm trying to read data from a log file in Python. Suppose the file is called data.log. The content of the file looks as follows:

# Performance log
# time, ff, T vector, dist, windnorth, windeast
0.00000000,0.00000000,0.00000000,0.00000000,0.00000000,0.00000000
1.00000000,3.02502604,343260.68655952,384.26845401,-7.70828175,-0.45288215
2.00000000,3.01495320,342124.21684440,767.95286901,-7.71506536,-0.45123853
3.00000000,3.00489957,340989.57100678,1151.05303883,-7.72185550,-0.44959182

I would like to obtain the last two columns and put them into two separate lists, so that I get output like:

list1 = [-7.70828175, -7.71506536, -7.72185550]

list2 = [-0.45288215, -0.45123853, -0.44959182]

I have tried reading the data with the code below, but instead of separate columns and rows I just get one whole column with three rows in return.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

file = open('data.log', 'r')

df = pd.read_csv('data.log', sep='\\s+')

df = list(df)

print (df[0])

Could someone indicate what I have to adjust in my code to obtain the output described above?

Thanks in advance!

The problem is the sep argument: sep='\s+' splits on whitespace, but the file is comma-separated. If you remove it, read_csv uses the default separator (the comma), which is the one you need:

For example:

>>> import pandas as pd
>>> import numpy as np
>>> file = open('data.log', 'r')
>>> df = pd.read_csv('data.log')  # or use sep=','
>>> df = list(df)
>>> df[0]
'1.00000000'
>>> df[5]
'-0.45288215'
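Note that list(df) returns the column labels rather than the data rows. As a minimal sketch of getting the two lists the question asks for, the snippet below reads the sample data with the comma separator, ignores the lines starting with #, and drops the all-zero first row. The column names, the read_log helper and the use of io.StringIO in place of data.log are assumptions made for illustration.

```python
import io

import pandas as pd

# Sample contents copied from the question; io.StringIO stands in for data.log.
SAMPLE = """\
# Performance log
# time, ff, T vector, dist, windnorth, windeast
0.00000000,0.00000000,0.00000000,0.00000000,0.00000000,0.00000000
1.00000000,3.02502604,343260.68655952,384.26845401,-7.70828175,-0.45288215
2.00000000,3.01495320,342124.21684440,767.95286901,-7.71506536,-0.45123853
3.00000000,3.00489957,340989.57100678,1151.05303883,-7.72185550,-0.44959182
"""

# Column names taken from the second comment line of the log (an assumption).
COLUMNS = ["time", "ff", "T vector", "dist", "windnorth", "windeast"]


def read_log(source):
    """Hypothetical helper: read the comma-separated log, ignoring '#' lines."""
    return pd.read_csv(source, comment="#", header=None, names=COLUMNS)


df = read_log(io.StringIO(SAMPLE))

# Drop the all-zero first row so the lists match the output asked for.
df = df.iloc[1:]

# Series.tolist() converts each column to a plain Python list of floats.
list1 = df["windnorth"].tolist()
list2 = df["windeast"].tolist()
print(list1)
print(list2)
```

To read the real file, pass its path instead of the StringIO object, for example read_log('data.log').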

Also, use skiprows to skip the header lines (here it skips the two comment lines and the all-zero first row).

import pandas as pd

# skip the two comment lines and the all-zero first row
df = pd.read_csv('sample.txt', skiprows=3, header=None,
                 names=['time', 'ff', 'T vector', 'dist', 'windnorth', 'windeast'])

# store a specific column in a list
spam = list(df['windeast'])
print(spam)

# two different ways to access columns
df['wind_diff'] = df.windnorth - df['windeast']
print(df)
print(df['wind_diff'])

Output:

[-0.45288215, -0.45123853, -0.44959182]
   time        ff       T vector         dist  windnorth  windeast  wind_diff
0   1.0  3.025026  343260.686560   384.268454  -7.708282 -0.452882  -7.255400
1   2.0  3.014953  342124.216844   767.952869  -7.715065 -0.451239  -7.263827
2   3.0  3.004900  340989.571007  1151.053039  -7.721856 -0.449592  -7.272264
0   -7.255400
1   -7.263827
2   -7.272264
Name: wind_diff, dtype: float64

Note: to create a plot in matplotlib you can work with pandas.Series directly; there is no need to store it in a list.
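A short sketch of that point, assuming the same sample data and column names as above: the Series objects are passed straight to ax.plot without converting them to lists. The Agg backend and the in-memory io.StringIO source are choices made here so the snippet runs without a display or a file on disk.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Sample contents copied from the question; io.StringIO stands in for the file.
SAMPLE = """\
# Performance log
# time, ff, T vector, dist, windnorth, windeast
0.00000000,0.00000000,0.00000000,0.00000000,0.00000000,0.00000000
1.00000000,3.02502604,343260.68655952,384.26845401,-7.70828175,-0.45288215
2.00000000,3.01495320,342124.21684440,767.95286901,-7.71506536,-0.45123853
3.00000000,3.00489957,340989.57100678,1151.05303883,-7.72185550,-0.44959182
"""

df = pd.read_csv(
    io.StringIO(SAMPLE),
    skiprows=3,  # two comment lines plus the all-zero first row
    header=None,
    names=["time", "ff", "T vector", "dist", "windnorth", "windeast"],
)

fig, ax = plt.subplots()
# Each argument is a pandas.Series; matplotlib accepts them as-is.
ax.plot(df["time"], df["windnorth"], label="windnorth")
ax.plot(df["time"], df["windeast"], label="windeast")
ax.set_xlabel("time")
ax.legend()
# Call fig.savefig(...) or, with an interactive backend, plt.show() to view it.
```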


 