
How can I read a text file into a pandas DataFrame when the columns have different lengths and some data is missing?

I have a text file like this:

[image]

As you can see, some values are missing, and some fields contain strings with spaces.

I need an output like the following:

[image]

[image]

When data is missing, just leave the field blank. Also, do not put a comma between the words "No" and "Presento". Is there a way to split the fields by a fixed length and separate them with commas? Each field here has a fixed length, but I don't know how to convert the file to a dataframe.

I remember doing something like this in bash with the substr() function.

Any ideas?

Sorry about my English. Thank you in advance!

This can be done with a classic pandas.read_csv, assuming the fields in the file are separated by tabs:

import pandas as pd

df = pd.read_csv(r'path_to_your_textfile.txt', sep='\t', header=None)

# Output:

print(df)

       0      1            2    3     4   5            6    7
0  Test1   90.0  No presento   67  99.0  67     Aprobado   89
1  Test2  100.0           96   76   NaN  76  No aprobado  100
2  Test3    NaN  No presento   89  80.0  99     Aprobado   78
3  Test4   78.0          100  100  83.0  88          NaN   96
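To try this without the original file, here is a minimal self-contained sketch. The sample rows are rebuilt from the output above and assume a tab-separated layout; the names `rows` and `sample` are illustrative only and not part of the asker's data.

```python
import io

import pandas as pd

# Sample rows rebuilt from the printed output; empty strings stand for missing fields.
rows = [
    ["Test1", "90", "No presento", "67", "99", "67", "Aprobado", "89"],
    ["Test2", "100", "96", "76", "", "76", "No aprobado", "100"],
    ["Test3", "", "No presento", "89", "80", "99", "Aprobado", "78"],
    ["Test4", "78", "100", "100", "83", "88", "", "96"],
]

# Join the fields with tabs and the rows with newlines to mimic the text file.
sample = "\n".join("\t".join(r) for r in rows)

# An empty field between two tabs is read as NaN; "No presento" stays one field
# because only tabs (not spaces) separate the columns.
df = pd.read_csv(io.StringIO(sample), sep="\t", header=None)

print(df)
print(df.isna().sum())  # number of missing values per column
```

Because the separator is a tab, the space inside "No presento" never splits the field.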

If needed, you can save the dataframe to a new text file with a comma separator using pandas.DataFrame.to_csv:

df.to_csv(r'path_to_your_new_textfile.txt', header=False, index=False)
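One detail worth checking: a numeric column that contains a missing value is read as float, so 90 would be written back as 90.0. A possible workaround, sketched here on a small made-up sample (the `sample` string is an assumption, not the asker's file), is to read everything as text with dtype=str. Missing fields are still read as NaN and written as empty fields.

```python
import io

import pandas as pd

# Hypothetical two-row, tab-separated sample; the second row has a missing field.
sample = "Test1\t90\tNo presento\t67\nTest2\t\t96\t76\n"

# dtype=str keeps every value exactly as written in the file (no 90 -> 90.0).
df = pd.read_csv(io.StringIO(sample), sep="\t", header=None, dtype=str)

# With no path given, to_csv returns the CSV text instead of writing a file.
# Missing values are written as empty fields (the default na_rep is "").
csv_text = df.to_csv(header=False, index=False)
print(csv_text)
```

The missing value ends up as two consecutive commas, and "No presento" is not split or quoted because it contains no comma.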

[image]
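If the file turns out to be padded with spaces to fixed column widths rather than tab-separated (the question mentions fields of a certain length), pandas.read_fwf may be a better fit. The sketch below is only an illustration: the widths in `widths` and the sample rows are assumptions, since the real column lengths are not shown in the post.

```python
import io

import pandas as pd

# Hypothetical column widths in characters; adjust them to the real file.
widths = [6, 4, 12, 3]

# Made-up rows; each value is padded with spaces to its column width.
rows = [
    ("Test1", "90", "No presento", "67"),
    ("Test2", "100", "96", "76"),
    ("Test3", "", "No presento", "89"),
]
sample = "\n".join(
    "".join(value.ljust(width) for value, width in zip(row, widths))
    for row in rows
)

# read_fwf cuts each line at the given widths, strips the padding,
# and treats all-blank fields as missing (NaN).
df = pd.read_fwf(io.StringIO(sample), widths=widths, header=None)

print(df)
print(df.to_csv(header=False, index=False))
```

This is roughly the pandas counterpart of slicing each line with substr(): the split depends on character positions, so spaces inside a field are harmless.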
