Python convert CSV section of list to Pandas dataframe

I am pulling in a text file that has a lot of different data: serial number, type, and a log of CSV data:

A123>

A123>read sn

sn = 12143

A123>read cms-sn

cms-sn = 12143-00000000-0000

A123>read fw-rev

fw-rev = 1.3, 1.3

A123>read log

log =

1855228,1,0,41,-57,26183,25,22,21,22,0,0,0,89,2048,500,0,0

1855240,1,0,33,0,26319,25,22,22,23,0,0,0,89,2048,500,0,0

2612010,1,0,41,-82,26122,20,21,21,21,0,0,0,87,2048,500,0,0

2612142,1,0,49,301,27607,21,22,21,21,0,0,0,81,2048,500,0,0

Here is the code I have so far:

import pandas as pd

lines = []                  # Declare an empty list named "lines"
with open ('03-22-2019.txt', 'rt') as in_file:  # Open file 
    for line in in_file:  # For each line of text in in_file, where the data is named "line",
        lines.append(line.rstrip('\n'))   # add that line to our list of lines, stripping newlines.

while('' in lines):
        lines.remove("")

lines = [x for x in lines if 'A123' not in x]  #delete all lines with 'A123'


for element in lines:            # For each element in our list,
        print(element)              # print it.


split_line = lines[0].split()  # create list with serial number line
Serial_Num = split_line[-1]
print(Serial_Num)

split_line = lines[1].split()  # go to line with CMS SN
CMS_SN = split_line[-1]
print(CMS_SN)

split_line = lines[2].split()
Firm_Rev_1 = split_line[-1]
Firm_Rev_2 = split_line[-2]
print(Firm_Rev_1)
print(Firm_Rev_2)
                                  #  Problem section starts here!
start_data = lines.index("log =") + 1                   #<<<<<<<<<<
data = [x for x in lines[start_data:].split(",")]       #<<<<<<<<<<
#dfObj = pd.DataFrame(lines[start_data:-1].split(","))  #<<<<<<<<<<

The problem comes up when I try to import the log section of the data into a dataframe and split the CSV values out into their own columns.

How do I programmatically find the start of the log data, and read the data from there to the end into a Pandas dataframe?

It looks like you're pretty close.

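# lines[start_data:] is a list, so it has no .split(); split each line instead.
# This gives a list of lists, one inner list per log line.
data = [line.split(',') for line in lines[start_data:]]
# This should construct your data frame. Optionally pass columns=[...] with
# your own column names (one per CSV field); the values are still strings here.
dfObj = pd.DataFrame(data=data)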
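Splitting each line by hand leaves every value as a string. If numeric columns are wanted, one alternative (not part of the answer above) is to rejoin the log rows and let pd.read_csv parse them. The sketch below is self-contained and makes a few assumptions: the `lines` list is hard-coded to mimic what the question's cleanup step produces from the sample file, the marker line is exactly "log =" with no trailing whitespace, and the log has no header row.

```python
import io
import pandas as pd

# Sample lines as they would look after the question's cleanup step
# (blank lines and 'A123>' prompt lines already removed). Hard-coded
# here for illustration instead of reading '03-22-2019.txt'.
lines = [
    "sn = 12143",
    "cms-sn = 12143-00000000-0000",
    "fw-rev = 1.3, 1.3",
    "log =",
    "1855228,1,0,41,-57,26183,25,22,21,22,0,0,0,89,2048,500,0,0",
    "1855240,1,0,33,0,26319,25,22,22,23,0,0,0,89,2048,500,0,0",
    "2612010,1,0,41,-82,26122,20,21,21,21,0,0,0,87,2048,500,0,0",
    "2612142,1,0,49,301,27607,21,22,21,21,0,0,0,81,2048,500,0,0",
]

# The log rows start on the line after the "log =" marker.
# Assumption: the marker has no trailing spaces; otherwise strip first.
start_data = lines.index("log =") + 1

# Rejoin the rows into one CSV string and let pandas parse it.
# header=None because the log has no header row, so pandas labels
# the columns with integers and infers numeric dtypes.
csv_text = "\n".join(lines[start_data:])
df = pd.read_csv(io.StringIO(csv_text), header=None)

print(df.shape)
print(df.head())
```

Real column names can be supplied with the names= argument of pd.read_csv once they are known; they are not given in the question, so none are invented here.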
