
Converting specific data from text file to csv using Python

I am trying to convert data from a text file into a CSV, with each value placed under its corresponding column name.

This is an example of the text file:

Run  1    Tbb= 20 C    Volt=3.093   Tamb= 20.13 C   

 \1b2JTriTemp 1.0.5


 AD Averaged 00f2(mV), 0001, 0001, 0001, 0001, 0001, 0001, 0000,
 RAW Values  00f2(xx), FFFF, FFFF, FFFF, FFFF, FFFF, FFFF,
 AD Averaged 0132(mV), 0001, 0001, 0004, 3061, 0002, 0001, 0000,
 RAW Values  0132(xx), 0000, 0002, 0006, 0D0F, 0003, 0000,

When I run this code:

import pandas as pd

with open("sample data as comma.txt", "r") as f:
    data = f.readlines()
with open("sample data as comma.txt", "w") as f:
    for line in data:
        if "RAW" not in line:
            f.write(line)

df = pd.read_csv("sample data as comma.txt", delimiter=',')
df.columns = ['', 'TH+', 'Vacm', 'Vout', 'Bat mon', 'TH-', 'Vbat2', 'Vamb', '']
df.to_csv('Sample raw data CSV.csv')

I get the error "pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 8, saw 9".

It is important to note that I want to keep the line that identifies each run, e.g. "Run 1" along with its Tbb, Volt and Tamb. This can just be on one line before each data set.

Also note that the Run line should be its own separate row, and not sorted into the columns.

Here is an example of how it should end up: [image of the expected output]

Any help/advice would be greatly appreciated, thanks!

The error occurs because read_csv takes the number of columns from the first line, which contains no commas, so the later 9-field lines do not fit. A little manipulation fixes this: pad the shorter rows with "blank" values.

The code below finds the maximum number of columns needed, then appends the extra "blank" values to any row that is shorter than that.

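import pandas as pd

# Read every line of the raw text file.
with open("sample data as comma.txt", "r") as f:
    data = f.readlines()

# Drop the RAW rows and split the remaining lines on commas.
data = [x.strip().split(',') for x in data if "RAW" not in x]
# The widest row determines how many columns are needed.
max_len = max(len(row) for row in data)

# Pad shorter rows (such as the Run line) with blank values.
for row in data:
    if len(row) < max_len:
        row += [''] * (max_len - len(row))

df = pd.DataFrame(data, columns=['', 'TH+', 'Vacm', 'Vout', 'Bat mon', 'TH-', 'Vbat2', 'Vamb', ''])
df.to_csv('Sample raw data CSV.csv', index=False)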

Output from your sample data:

print(df.to_string())
                                                        TH+   Vacm   Vout Bat mon    TH-  Vbat2   Vamb  
0  Run  1    Tbb= 20 C    Volt=3.093   Tamb= 20.13 C                                                    
1                                                                                                       
2                                 \1b2JTriTemp 1.0.5                                                    
3                                                                                                       
4                                                                                                       
5                               AD Averaged 00f2(mV)   0001   0001   0001    0001   0001   0001   0000  
6                               AD Averaged 0132(mV)   0001   0001   0004    3061   0002   0001   0000  
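
If the blank rows and the leading spaces in each value are not wanted, the same padding idea can be sketched with only the standard library. This is a variation and not part of the code above: the helper name `pad_rows`, the choice to skip blank lines, and the use of `csv.writer` in place of pandas are all assumptions made for illustration.

```python
import csv
import io

# Column names taken from the question; the first and last are intentionally blank.
COLUMNS = ['', 'TH+', 'Vacm', 'Vout', 'Bat mon', 'TH-', 'Vbat2', 'Vamb', '']


def pad_rows(lines, width=len(COLUMNS)):
    """Hypothetical helper: split kept lines on commas and pad them to `width` fields."""
    rows = []
    for line in lines:
        # Assumption: blank lines are dropped, along with the RAW rows.
        if not line.strip() or "RAW" in line:
            continue
        # Strip whitespace around every field, not just around the whole line.
        fields = [field.strip() for field in line.strip().split(',')]
        # Pad short rows (such as the Run line) with blank values.
        fields += [''] * (width - len(fields))
        rows.append(fields)
    return rows


# A few lines copied from the sample text file in the question.
sample = [
    "Run  1    Tbb= 20 C    Volt=3.093   Tamb= 20.13 C   \n",
    "\n",
    " AD Averaged 00f2(mV), 0001, 0001, 0001, 0001, 0001, 0001, 0000,\n",
    " RAW Values  00f2(xx), FFFF, FFFF, FFFF, FFFF, FFFF, FFFF,\n",
]

rows = pad_rows(sample)

# Write to an in-memory buffer here; a real file opened with newline='' works the same way.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(COLUMNS)
writer.writerows(rows)
print(buffer.getvalue())
```

The Run line still ends up as its own row, with its text in the first column and blanks in the rest.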
