
PYTHON: Fill NaN values of a df according to a dict keyed by Intervals | TypeError: unorderable types: Interval() < int()

The last statement raises: TypeError: unorderable types: Interval() < int()

import numpy as np
import pandas as pd

j = pd.DataFrame({'a': [12, 16, 23, 27, 22, 36, 31, 38],
                  'b': [np.nan, 23, 58, np.nan, np.nan, np.nan, 76, np.nan]})

bins = [0, 10, 20, 30, 40]

k = pd.cut(j.a, bins)   # assign each value of a to an Interval bin

j['new'] = k

medians = j.groupby('new').b.median()   # computation doesn't matter

d = medians.to_dict()   # keys are Interval objects

j['b'] = j['b'].fillna(j['new'].map(d))   # this line raises the TypeError

I have tried the same thing with plain floats as keys instead of Intervals, and it works fine.

It works fine for me. You may need the latest version of pandas, 0.20.2:

import numpy as np
import pandas as pd

j = pd.DataFrame({'a':[12,16,23,27,22,36,31,38], 
                  'b':[np.nan, 23, 58, np.nan, np.nan, np.nan, 76, np.nan]})

bins = [0, 10, 20, 30, 40]
j['new'] = pd.cut(j.a, bins)
print (j)
    a     b       new
0  12   NaN  (10, 20]
1  16  23.0  (10, 20]
2  23  58.0  (20, 30]
3  27   NaN  (20, 30]
4  22   NaN  (20, 30]
5  36   NaN  (30, 40]
6  31  76.0  (30, 40]
7  38   NaN  (30, 40]

d = j.groupby('new').b.median().to_dict()
print (d)
{Interval(30, 40, closed='right'): 76.0, 
 Interval(0, 10, closed='right'): nan, 
 Interval(10, 20, closed='right'): 23.0, 
 Interval(20, 30, closed='right'): 58.0}

j['b'] = j['b'].fillna(j['new'].map(d))
print (j)
    a     b       new
0  12  23.0  (10, 20]
1  16  23.0  (10, 20]
2  23  58.0  (20, 30]
3  27  58.0  (20, 30]
4  22  58.0  (20, 30]
5  36  76.0  (30, 40]
6  31  76.0  (30, 40]
7  38  76.0  (30, 40]
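
If upgrading pandas is not an option, one possible workaround follows from the observation in the question that plain numeric keys work: ask `pd.cut` for integer bin positions instead of Interval objects by passing `labels=False`. The following is only a sketch under that assumption; the column name `bin_code` is made up for illustration and is not part of the original code.

```python
import numpy as np
import pandas as pd

j = pd.DataFrame({'a': [12, 16, 23, 27, 22, 36, 31, 38],
                  'b': [np.nan, 23, 58, np.nan, np.nan, np.nan, 76, np.nan]})
bins = [0, 10, 20, 30, 40]

# labels=False makes pd.cut return the integer position of each bin
# rather than an Interval ('bin_code' is a hypothetical column name).
j['bin_code'] = pd.cut(j['a'], bins, labels=False)

# Median of b per bin, stored in a dict keyed by plain integers.
d = j.groupby('bin_code')['b'].median().to_dict()

# Fill the missing values of b with the median of the matching bin.
j['b'] = j['b'].fillna(j['bin_code'].map(d))
print(j)
```

Because the dict keys are ordinary integers, the mapping step never has to compare an Interval with an int. The trade-off is that the bin edges are no longer visible in the helper column.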

Simpler solution:

j['b'] = j.groupby(pd.cut(j.a, bins))['b'].apply(lambda x: x.fillna(x.median()))
print (j)
    a     b
0  12  23.0
1  16  23.0
2  23  58.0
3  27  58.0
4  22  58.0
5  36  76.0
6  31  76.0
7  38  76.0
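
In more recent pandas versions the result of `groupby(...).apply(...)` may carry the group keys as an extra index level, so assigning it back to a column may not align with the original rows. A variant that sidesteps this, sketched here as an assumption rather than as part of the original answer, uses `transform`, which returns a result with the same index as the input. The `observed=True` argument is also an addition; it exists only in newer pandas releases and restricts the groups to bins that actually occur in the data.

```python
import numpy as np
import pandas as pd

j = pd.DataFrame({'a': [12, 16, 23, 27, 22, 36, 31, 38],
                  'b': [np.nan, 23, 58, np.nan, np.nan, np.nan, 76, np.nan]})
bins = [0, 10, 20, 30, 40]

# Group b by the Interval bin of a; observed=True skips the empty (0, 10] bin.
grouped = j.groupby(pd.cut(j['a'], bins), observed=True)['b']

# transform('median') broadcasts each bin's median back onto its rows,
# keeping the original index so fillna aligns row by row.
j['b'] = j['b'].fillna(grouped.transform('median'))
print(j)
```

Here the dict and the helper column disappear entirely, so no Interval keys need to be looked up.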

