Reading data from a log file in Python

Question

I'm trying to read data from a log file in Python. Suppose the file is called data.log. The content of the file looks as follows:

# Performance log
# time, ff, T vector, dist, windnorth, windeast
0.00000000,0.00000000,0.00000000,0.00000000,0.00000000,0.00000000
1.00000000,3.02502604,343260.68655952,384.26845401,-7.70828175,-0.45288215
2.00000000,3.01495320,342124.21684440,767.95286901,-7.71506536,-0.45123853
3.00000000,3.00489957,340989.57100678,1151.05303883,-7.72185550,-0.44959182

I would like to obtain the last two columns and put them into two separate lists, so that I get output like:

list1 = [-7.70828175, -7.71506536, -7.72185550]

list2 = [-0.45288215, -0.45123853, -0.44959182]

I tried reading the data with the code below, but instead of separate columns and rows I just get a single column with three rows.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

file = open('data.log', 'r')

df = pd.read_csv('data.log', sep='\\s+')

df = list(df)

print (df[0])

Could someone indicate what I have to adjust in my code to obtain the required output shown above?

Thanks in advance!

Answer 1

The error is in the sep argument. If you remove it, read_csv uses the default separator (the comma), which is the one you need:

For example:

>>> import pandas as pd
>>> import numpy as np
>>> file = open('data.log', 'r')
>>> df = pd.read_csv('data.log')  # or use sep=','
>>> df = list(df)
>>> df[0]
'1.00000000'
>>> df[5]
'-0.45288215'

Also, use skiprows to skip the header lines.
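As a rough sketch of the difference the separator makes, the snippet below parses the sample lines from the question twice: once with the whitespace separator and once with the default comma. It reads from an in-memory string (LOG_TEXT, an illustrative name) instead of data.log so that it runs on its own, and it assumes that skiprows=2 is enough to skip the two comment lines. Note that list(df) on a DataFrame returns its column labels rather than its values, so the sketch selects columns and calls tolist() instead.

```python
import io

import pandas as pd

# Sample content copied from the question's data.log
LOG_TEXT = """# Performance log
# time, ff, T vector, dist, windnorth, windeast
0.00000000,0.00000000,0.00000000,0.00000000,0.00000000,0.00000000
1.00000000,3.02502604,343260.68655952,384.26845401,-7.70828175,-0.45288215
2.00000000,3.01495320,342124.21684440,767.95286901,-7.71506536,-0.45123853
3.00000000,3.00489957,340989.57100678,1151.05303883,-7.72185550,-0.44959182
"""

# Whitespace separator: the data lines contain no whitespace,
# so every line ends up as a single field.
one_col = pd.read_csv(io.StringIO(LOG_TEXT), sep=r"\s+", skiprows=2, header=None)

# Default comma separator: six fields per line.
six_cols = pd.read_csv(io.StringIO(LOG_TEXT), skiprows=2, header=None)

print(one_col.shape)   # (4, 1)
print(six_cols.shape)  # (4, 6)

# With header=None the columns are labelled 0..5; the last two are 4 and 5.
# iloc[1:] drops the initial all-zero row, as in the desired output.
list1 = six_cols.iloc[1:, 4].tolist()
list2 = six_cols.iloc[1:, 5].tolist()
print(list1)
print(list2)
```

Passing skiprows=3 instead would skip the all-zero row at read time, which makes the iloc[1:] slice unnecessary.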

Answer 2

import pandas as pd
df = pd.read_csv('sample.txt', skiprows=3, header=None,
                 names=['time', 'ff', 'T vector', 'dist', 'windnorth', 'windeast'])
# store a specific column in a list
spam = list(df['windeast'])
print(spam)
# two different ways to access columns: attribute access and bracket indexing
df['wind_diff'] = df.windnorth - df['windeast']
print(df)
print(df['wind_diff'])

Output

[-0.45288215, -0.45123853, -0.44959182]
   time        ff       T vector         dist  windnorth  windeast  wind_diff
0   1.0  3.025026  343260.686560   384.268454  -7.708282 -0.452882  -7.255400
1   2.0  3.014953  342124.216844   767.952869  -7.715065 -0.451239  -7.263827
2   3.0  3.004900  340989.571007  1151.053039  -7.721856 -0.449592  -7.272264
0   -7.255400
1   -7.263827
2   -7.272264
Name: wind_diff, dtype: float64

Note: to create a plot in matplotlib you can work with a pandas.Series directly; there is no need to store it in a list.
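A minimal sketch of plotting the Series directly might look like the following. It assumes the same column names as above, reads the sample lines from an in-memory string (LOG_TEXT, an illustrative name) so that it runs on its own, and selects the non-interactive Agg backend so no display is required; wind.png is a hypothetical output file name.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Sample content copied from the question's data.log
LOG_TEXT = """# Performance log
# time, ff, T vector, dist, windnorth, windeast
0.00000000,0.00000000,0.00000000,0.00000000,0.00000000,0.00000000
1.00000000,3.02502604,343260.68655952,384.26845401,-7.70828175,-0.45288215
2.00000000,3.01495320,342124.21684440,767.95286901,-7.71506536,-0.45123853
3.00000000,3.00489957,340989.57100678,1151.05303883,-7.72185550,-0.44959182
"""

names = ['time', 'ff', 'T vector', 'dist', 'windnorth', 'windeast']
df = pd.read_csv(io.StringIO(LOG_TEXT), skiprows=3, header=None, names=names)

fig, ax = plt.subplots()
# Each argument is a pandas.Series; no conversion to a list is required.
ax.plot(df['time'], df['windnorth'], label='windnorth')
ax.plot(df['time'], df['windeast'], label='windeast')
ax.set_xlabel('time')
ax.legend()
fig.savefig('wind.png')  # hypothetical output file name
```

In an interactive session, dropping the matplotlib.use("Agg") line and calling plt.show() instead of savefig would open the figure in a window.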

2 answers

Answer 1
1 2021-03-25 13:10:05

Answer 2
1 Accepted 2021-03-25 13:14:46
