I imported a tab-delimited file to create a dataframe (df) which has the following label column:
label
1
2
3
1
This is stored as pandas.core.series.Series, and I want to convert it to string format so that I can get rid of the decimals when I write it out to a text file.
df.class_label=df.label.fillna('')
df.to_string(columns=['label'],index=False)
The variable type is still Series, and the output (text file) still has the decimals:
1.0 2.0 3.0 1.0
How do I get rid of these decimals?
You can use the float_format keyword argument of the to_string() method:
df.to_string(columns=['label'], index=False, float_format=lambda x: '{:.0f}'.format(x))
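As a minimal sketch of the float_format approach, assuming a small stand-in dataframe rather than the asker's actual file: to_string() returns a new string and leaves df unchanged, so the result has to be captured (the variable name text is only illustrative) before it can be written anywhere.

```python
import pandas as pd

# Stand-in data; the real df would come from the tab-delimited file.
df = pd.DataFrame({'label': [1.0, 2.0, 3.0, 1.0]})

# to_string() does not modify df; it returns the formatted text.
# '{:.0f}' formats each float with zero decimal places.
text = df.to_string(columns=['label'], index=False,
                    float_format=lambda x: '{:.0f}'.format(x))
print(text)
```

The captured string can then be passed to a file object's write() method.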
Using astype(int) will change a float to an int and will drop your .0 as desired.
import pandas as pd
df = pd.DataFrame({'label': [1.0, 2.0, 4.0, 1.0]})
print(df)
label
0 1.0
1 2.0
2 4.0
3 1.0
df.label = df.label.astype(int)
print(df)
label
0 1
1 2
2 4
3 1
Here we do not need to convert this to a string. That happens when exporting to .csv or .txt, and the int is preserved.
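One caveat worth checking: astype(int) raises an error if the column contains NaN. The sketch below assumes a made-up column with one missing value and shows pandas' nullable 'Int64' dtype as one possible alternative; it is not part of the answer above.

```python
import pandas as pd

# Hypothetical data with one missing label.
df = pd.DataFrame({'label': [1.0, 2.0, None, 3.0]})

# A plain int cast cannot represent NaN, so it raises ValueError.
try:
    df['label'].astype(int)
    cast_failed = False
except ValueError:
    cast_failed = True
print(cast_failed)

# The nullable integer dtype keeps whole numbers and marks the gap as <NA>.
nullable = df['label'].astype('Int64')
print(nullable)
```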
I think you have some NaN values, so the int values are converted to float because of NA type promotions.
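The promotion can be seen in a short sketch, using two made-up Series (the names complete and with_gap are only illustrative):

```python
import pandas as pd

complete = pd.Series([1, 2, 3, 1])     # no missing values: stays integer
with_gap = pd.Series([1, 2, None, 3])  # None becomes NaN, forcing float

print(complete.dtype)
print(with_gap.dtype)
print(with_gap.tolist())
```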
To avoid this, you can read the label column as str, and then it works nicely:
import pandas as pd
import numpy as np
import io
temp=u"""lab1;label
5;1
5;2
7;
7;3
"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=';', dtype={'label':str})
print (df)
lab1 label
0 5 1
1 5 2
2 7 NaN
3 7 3
df['class_label'] = df.label.fillna('')
print (df)
lab1 label class_label
0 5 1 1
1 5 2 2
2 7 NaN
3 7 3 3
print (df.to_string(columns=['class_label'],index=False))
class_label
1
2
3
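Since the goal is a text file, here is a rough sketch of the final write step under the same approach. It assumes tab-separated output is wanted and uses an in-memory io.StringIO buffer (named out here) as a stand-in for a real file path.

```python
import io
import pandas as pd

temp = "lab1;label\n5;1\n5;2\n7;\n7;3\n"
df = pd.read_csv(io.StringIO(temp), sep=';', dtype={'label': str})
df['class_label'] = df['label'].fillna('')

# Stand-in for a real file; pass a filename to to_csv() instead.
out = io.StringIO()
df.to_csv(out, columns=['lab1', 'class_label'], sep='\t', index=False)
print(out.getvalue())
```

Because the labels were read as strings, they are written back exactly as they appeared in the input, with no decimals.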