Python: convert CSV section of list to Pandas dataframe
I am pulling in a text file that contains several kinds of data: a serial number, a type, and a log of CSV data:
A123>
A123>read sn
sn = 12143
A123>read cms-sn
cms-sn = 12143-00000000-0000
A123>read fw-rev
fw-rev = 1.3, 1.3
A123>read log
log =
1855228,1,0,41,-57,26183,25,22,21,22,0,0,0,89,2048,500,0,0
1855240,1,0,33,0,26319,25,22,22,23,0,0,0,89,2048,500,0,0
2612010,1,0,41,-82,26122,20,21,21,21,0,0,0,87,2048,500,0,0
2612142,1,0,49,301,27607,21,22,21,21,0,0,0,81,2048,500,0,0
Here is the code I have so far:
import pandas as pd
lines = []  # Declare an empty list named "lines"
with open('03-22-2019.txt', 'rt') as in_file:  # Open file
    for line in in_file:  # For each line of text in in_file, where the data is named "line",
        lines.append(line.rstrip('\n'))  # add that line to our list of lines, stripping newlines.
while '' in lines:
    lines.remove("")
lines = [x for x in lines if 'A123' not in x]  # delete all lines with 'A123'
for element in lines:  # For each element in our list,
    print(element)  # print it.
split_line = lines[0].split()  # create list with serial number line
Serial_Num = split_line[-1]
print(Serial_Num)
split_line = lines[1].split()  # go to line with CMS SN
CMS_SN = split_line[-1]
print(CMS_SN)
split_line = lines[2].split()
Firm_Rev_1 = split_line[-1]
Firm_Rev_2 = split_line[-2]
print(Firm_Rev_1)
print(Firm_Rev_2)
# Problem section starts here!
start_data = lines.index("log =") + 1 #<<<<<<<<<<
data = [x for x in lines[start_data:].split(",")] #<<<<<<<<<<
#dfObj = pd.DataFrame(lines[start_data:-1].split(",")) #<<<<<<<<<<
The problem comes up when I try to import the log section of the data into a dataframe and split the CSV values out into their own columns.
How do I programmatically find the start of the log data, and read the data from there to the end into a Pandas dataframe?
It looks like you're pretty close. lines[start_data:] is a list, and a list has no split method, so split each line individually:
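# This gives you a list of lists, one inner list per log line.
data = [line.split(',') for line in lines[start_data:]]
# Placeholder column names; replace them with your real ones.
column_names = ['col_{}'.format(i) for i in range(len(data[0]))]
# This should construct your data frame (the values are still strings at this point).
dfObj = pd.DataFrame(data=data, columns=column_names)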
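As a fuller sketch of the same idea, the log lines can also be handed to pd.read_csv through io.StringIO, which should parse the numbers as integers instead of leaving them as strings. The function name parse_device_dump, its parameters, and the shortened inline sample are assumptions made for illustration; they are not part of the original code, and the real file may need different handling.

```python
import io

import pandas as pd


def parse_device_dump(text, prompt="A123>", log_marker="log ="):
    """Split a device dump into header fields and a DataFrame of log rows.

    Hypothetical helper: assumes prompt lines start with `prompt` and that
    everything after the `log_marker` line is comma-separated data.
    """
    # Drop blank lines and the echoed prompt lines.
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith(prompt)]

    # Lines before the marker are "key = value"; lines after it are CSV rows.
    start = lines.index(log_marker) + 1
    header = {}
    for ln in lines[:start - 1]:
        key, _, value = ln.partition("=")
        header[key.strip()] = value.strip()

    # header=None tells pandas the first row is data, not column names.
    df = pd.read_csv(io.StringIO("\n".join(lines[start:])), header=None)
    return header, df


# Shortened copy of the sample from the question.
sample = """A123>
A123>read sn
sn = 12143
A123>read cms-sn
cms-sn = 12143-00000000-0000
A123>read fw-rev
fw-rev = 1.3, 1.3
A123>read log
log =
1855228,1,0,41,-57,26183,25,22,21,22,0,0,0,89,2048,500,0,0
1855240,1,0,33,0,26319,25,22,22,23,0,0,0,89,2048,500,0,0
"""

header, df = parse_device_dump(sample)
print(header)
print(df.dtypes.unique())
print(df)
```

With header=None the columns are simply numbered from 0; pass names=[...] to pd.read_csv once you know what each of the 18 fields means.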