Undesired behavior: DataFrame.combine turns ints to floats
Description: My DataFrame contains a list of filenames (index) and some metadata about each:
            pags  rating  tms  glk
name
file1  original0       1    1    1
file2  original1       2    2    2
file3  original2       3    3    3
file4  original3       4    4    4
file5  original4       5    5    5
Sometimes I need to update some of the columns for some of the files, leaving all other cells unchanged.
Furthermore, the update can contain new files that I need to add as new rows (probably with some N/As).
The update comes in the form of another DataFrame, upd:
       pags  rating
name
file4  new0      11
file5  new1      12
file6  new2      13
file7  new3      14
Here, I want to change pags and rating for files 4 and 5, and append new rows for files 6 and 7.
I found I can do this with DataFrame.combine:
df = df.combine(upd, lambda old,new: new.fillna(old), overwrite=False)[df.columns]
            pags  rating  tms  glk
name
file1  original0     1.0  1.0  1.0
file2  original1     2.0  2.0  2.0
file3  original2     3.0  3.0  3.0
file4       new0    11.0  4.0  4.0
file5       new1    12.0  5.0  5.0
file6       new2    13.0  NaN  NaN
file7       new3    14.0  NaN  NaN
The only problem is that all integer columns turned into floats.
How do I keep the original dtypes?
I strongly want to avoid a manual .astype() for every column.
Code to create this example:
import pandas as pd

df = pd.DataFrame({
    'name': ['file1','file2','file3','file4','file5'],
    'pags': ["original"+str(i) for i in range(5)],
    'rating': [1, 2, 3, 4, 5],
    'tms': [1, 2, 3, 4, 5],
    'glk': [1, 2, 3, 4, 5],
}).set_index('name')
upd = pd.DataFrame({
    'name': ['file4','file5','file6','file7'],
    'pags': ["new"+str(i) for i in range(4)],
    'rating': [11, 12, 13, 14],
}).set_index('name')
# Values from upd win; cells missing in upd fall back to the old values.
df = df.combine(upd, lambda old,new: new.fillna(old), overwrite=False)[df.columns]
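Unless I missed something, you do not have to call .astype() for every column, only once for the whole DataFrame. Note that .fillna(0) replaces the missing values in the new rows with 0, which is what makes the cast back to int possible, and errors="ignore" leaves the non-numeric pags column as it is: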
df = (
    df.combine(upd, lambda old, new: new.fillna(old), overwrite=False)[df.columns]
    .fillna(0)
    .astype(int, errors="ignore")
)
print(df)
# Output
            pags  rating  tms  glk
name
file1  original0       1    1    1
file2  original1       2    2    2
file3  original2       3    3    3
file4       new0      11    4    4
file5       new1      12    5    5
file6       new2      13    0    0
file7       new3      14    0    0
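If 0 is not an acceptable stand-in for the missing tms and glk values, a possible alternative (not part of the answer above, only a sketch) is to cast the originally integer columns to pandas' nullable Int64 dtype. The upcast to float happens because a NumPy integer column cannot hold NaN; Int64 can hold a missing value as <NA>, so the columns stay integer-typed without filling anything in. The names int_cols, combined and result are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["file1", "file2", "file3", "file4", "file5"],
    "pags": ["original" + str(i) for i in range(5)],
    "rating": [1, 2, 3, 4, 5],
    "tms": [1, 2, 3, 4, 5],
    "glk": [1, 2, 3, 4, 5],
}).set_index("name")

upd = pd.DataFrame({
    "name": ["file4", "file5", "file6", "file7"],
    "pags": ["new" + str(i) for i in range(4)],
    "rating": [11, 12, 13, 14],
}).set_index("name")

# Remember which columns were integer-typed before combining
# (int_cols is a hypothetical helper name, not from the original post).
int_cols = [col for col in df.columns if pd.api.types.is_integer_dtype(df[col])]

# Same combine call as in the question; integer columns come back as float64.
combined = df.combine(upd, lambda old, new: new.fillna(old), overwrite=False)[df.columns]

# One astype call with a dict built from the original dtypes:
# the nullable Int64 dtype keeps missing cells as <NA> instead of NaN.
result = combined.astype({col: "Int64" for col in int_cols})

print(result.dtypes)
print(result)
```

The dict passed to astype is derived from the original frame, so no column has to be listed by hand. The trade-off is that the columns end up as Int64 (the nullable extension dtype) rather than the original int64.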