
Python convert a .txt file with special characters into dataframe

I have a '.txt' file that I want to import and convert into a dataframe, but I am running into issues.

My code:

#The raw.txt file content: 
#A& B & C & D & E
#foo& 13.52 & 333.2 & 4504.4 & 0
#1 taw & 13.49 & 314.6 & 4.6 & 1.29
#2 ewq & 35.44 & 4.2 & 5.2 & 3.06
#3 asd & 13.41 & 4.1 & 6.8 & 5.04
#4 er & 13.37 & 230.0 & 7.1 & 7.07
#5 we & 13.33 & 199.7 & 8.9 & 9.12
#6 wed & 13.27 & 169.4 & 8.6 & 11.17

import pandas as pd
df = pd.read_csv('raw.txt', delimiter = "\n",sep=" & ")

print(df.columns)

Index(['A& B & C & D & E'], dtype='object')

It did not quite convert into a dataframe: the columns were not recognized, and everything was read as a single column.

delimiter and sep are actually aliases; you can use either of them, but not both. Use a regex separator to absorb the spaces around the &, and skiprows=1 to ignore the first row:

pd.read_csv('filename.txt', sep=r'\s*&\s*', engine='python', skiprows=1)

Output:

       #A      B      C       D      E
0    #foo  13.52  333.2  4504.4   0.00
1  #1 taw  13.49  314.6     4.6   1.29
2  #2 ewq  35.44    4.2     5.2   3.06
3  #3 asd  13.41    4.1     6.8   5.04
4   #4 er  13.37  230.0     7.1   7.07
5   #5 we  13.33  199.7     8.9   9.12
6  #6 wed  13.27  169.4     8.6  11.17
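The approach above can be sketched end to end. In this sketch, `io.StringIO` stands in for `raw.txt`, and the stray `#` markers (which are part of the pasted file content) are stripped afterwards; the cleanup steps are an illustrative assumption, not part of the original answer:

```python
import io
import pandas as pd

# Stand-in for raw.txt; the '#' markers are part of the pasted content
raw = """\
#The raw.txt file content:
#A& B & C & D & E
#foo& 13.52 & 333.2 & 4504.4 & 0
#1 taw & 13.49 & 314.6 & 4.6 & 1.29
#2 ewq & 35.44 & 4.2 & 5.2 & 3.06
"""

# The regex separator swallows the uneven spaces around '&';
# engine='python' is required for a regex sep, and skiprows=1
# drops the leading comment line
df = pd.read_csv(io.StringIO(raw), sep=r"\s*&\s*", engine="python", skiprows=1)

# Strip the leading '#' left over in the header and the first column
df.columns = [c.lstrip("#") for c in df.columns]
df["A"] = df["A"].str.lstrip("#")

print(df)
```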

When using pd.read_csv(), delimiter is an alias for sep; you can read about it here. Therefore, you are not selecting the correct delimiter for your file.

You can use the following:

pd.read_csv("raw.txt", sep="&")

If you use sep=" & ", the second line of your file will throw an error, because the missing space at the beginning means there aren't enough columns.

And that should work.
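A small sketch of what this leaves behind (again using `io.StringIO` in place of `raw.txt`): with sep="&" the padding spaces survive in the column names and in the first string column, so a follow-up strip is usually wanted. The cleanup lines are an assumption, not part of the answer:

```python
import io
import pandas as pd

# Stand-in for raw.txt (without the '#' markers, for brevity)
raw = """\
A& B & C & D & E
foo& 13.52 & 333.2 & 4504.4 & 0
1 taw & 13.49 & 314.6 & 4.6 & 1.29
"""

df = pd.read_csv(io.StringIO(raw), sep="&")

# Padding spaces stay in the headers, e.g. ' B ', ' C '
print(list(df.columns))

# Strip the leftover whitespace from headers and string cells;
# numeric columns are parsed to floats despite the surrounding spaces
df.columns = df.columns.str.strip()
df["A"] = df["A"].str.strip()
print(df)
```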
