Pandas data frame fill null values with index

I have a dataframe where for one column I want to fill null values with the index value. What is the best way of doing this?

Say my dataframe looks like this:

>>> import numpy as np
>>> import pandas as pd
>>> d=pd.DataFrame(index=['A','B','C'], columns=['Num','Name'], data=[[1,'Andrew'], [2, np.nan], [3, 'Chris']])
>>> print(d)

   Num    Name
A    1  Andrew
B    2     NaN
C    3   Chris

I can use the following line of code to get what I'm looking for:

d['Name'][d['Name'].isnull()]=d.index

However, I get the following warning: "A value is trying to be set on a copy of a slice from a DataFrame"

I imagine it'd be better to do this either using fillna or loc, but I can't figure out how to do this with either. I have tried the following:

>>> d['Name']=d['Name'].fillna(d.index)

>>> d.loc[d['Name'].isnull()]=d.index

Any suggestions on which is the best option?

IMO you should use fillna. An Index is not an acceptable type for the fill value, so you need to pass a Series instead; Index has a to_series method for this:

In [13]:
d=pd.DataFrame(index=['A','B','C'], columns=['Num','Name'], data=[[1,'Andrew'], [2, np.nan], [3, 'Chris']])
d['Name']=d['Name'].fillna(d.index.to_series())
d

Out[13]:
   Num    Name
A    1  Andrew
B    2       B
C    3   Chris
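
For what it's worth, this works because fillna aligns a Series argument on index labels, so each missing entry picks up the value stored under its own row label. A minimal sketch, assuming the same toy frame as above; the helper name fill_nulls_with_index is hypothetical and not a pandas function:

```python
import numpy as np
import pandas as pd


def fill_nulls_with_index(df, column):
    """Return a copy of df where nulls in `column` are replaced by the row label.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    # Index.to_series() gives a Series whose index and values are both the labels,
    # so fillna can align it with the column label by label.
    out[column] = out[column].fillna(out.index.to_series())
    return out


d = pd.DataFrame(
    index=['A', 'B', 'C'],
    columns=['Num', 'Name'],
    data=[[1, 'Andrew'], [2, np.nan], [3, 'Chris']],
)
filled = fill_nulls_with_index(d, 'Name')
print(filled['Name'].tolist())  # ['Andrew', 'B', 'Chris']
```

Working on a copy leaves the original d untouched, which is a design choice of this sketch rather than something fillna requires.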

I would use .loc in this case, like this:

d.loc[d['Name'].isnull(), 'Name'] = d.loc[d['Name'].isnull()].index
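
The same idea can be written with the null mask computed once. This is only a sketch, assuming the index labels are of a type that fits in the column (strings here); the variable name missing is illustrative. Because a single .loc call selects the rows and the column together, the assignment targets d itself, which is what avoids the "set on a copy" warning produced by the chained d['Name'][...] form:

```python
import numpy as np
import pandas as pd

d = pd.DataFrame(
    index=['A', 'B', 'C'],
    columns=['Num', 'Name'],
    data=[[1, 'Andrew'], [2, np.nan], [3, 'Chris']],
)

# Boolean mask of the rows whose Name is missing, computed once and reused.
missing = d['Name'].isnull()

# One .loc call selects rows and column together, so the assignment
# writes into d directly rather than into an intermediate object.
d.loc[missing, 'Name'] = d.index[missing.to_numpy()]

print(d['Name'].tolist())  # ['Andrew', 'B', 'Chris']
```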
