Replacing nan with None in pandas dataframe MultiIndex
I am trying to replace nan with None in a pandas DataFrame MultiIndex. It seems that None is converted to nan in a MultiIndex (but not in other index types).
The following does not work (taken from the question "Replace NaN in DataFrame index"):
import numpy as np
import pandas as pd
df = pd.DataFrame([['a', True, 1], ['b', True, 2], ['c', False, 3], ['d', None, 4]], columns=['c1', 'c2', 'c3'])
df.set_index(['c1','c2'], inplace=True)
df.index = pd.MultiIndex.from_frame(df.index.to_frame().fillna(np.nan).replace([np.nan], [None]))
df
          c3
c1 c2
a  True    1
b  True    2
c  False   3
d  NaN     4
type(df.index[3][1])
<class 'float'>
Neither does this:
index_tuples = [tuple(row) for row in df.index.to_frame().fillna(np.nan).replace([np.nan], [None]).values]
pd.MultiIndex.from_tuples(index_tuples)
MultiIndex([('a', True),
('b', True),
('c', False),
('d', nan)],
)
type(df.index[3][1])
<class 'float'>
It seems None is converted to NaN in a MultiIndex.
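A possible explanation, offered here as an assumption rather than something stated in the question: a MultiIndex does not store its tuples directly. It keeps the distinct values of each level in `levels` and integer positions in `codes`, and a missing entry is recorded as code -1 instead of being stored as a value. The small index `mi` below is made up for illustration only.

```python
import pandas as pd

# Hypothetical two-level index with one missing entry (illustration only).
mi = pd.MultiIndex.from_tuples([("a", True), ("d", None)], names=["c1", "c2"])

# Each level keeps only its distinct, non-missing values.
print(mi.levels[1])

# Codes are positions into the level; -1 marks a missing entry.
print(list(mi.codes[1]))

# Reading the entry back yields a value pandas treats as missing,
# which need not be the original None object.
print(pd.isna(mi[1][1]))
```

If this is what happens, the None is already gone by the time the MultiIndex is constructed, which would be consistent with both `from_frame` and `from_tuples` giving the same result above.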
PS. It works for other index types:
>>> df = pd.DataFrame([['a', True, 1], ['b', True, 2], ['c', False, 3], ['d', None, 4]], columns=['c1', 'c2', 'c3'])
>>> df.set_index('c2', inplace=True)
>>> df
      c1  c3
c2
True   a   1
True   b   2
False  c   3
NaN    d   4
>>> df.index = df.index.fillna(value=np.nan).to_series().replace([np.nan], [None])
>>> df
      c1  c3
c2
True   a   1
True   b   2
False  c   3
NaN    d   4
>>> type(df.index[3])
<class 'NoneType'>
The only way I managed to do it is by manipulating the NumPy array directly. It seems that any assignment of None values through a MultiIndex in pandas results in conversion to NaN.
import pandas as pd
import numpy as np
df = pd.DataFrame([['a', True, 1], ['b', True, 2], ['c', False, 3], ['d', None, 4]], columns=['c1', 'c2', 'c3'])
df.set_index(['c1','c2'], inplace=True)
def replace_nan(x):
    new_x = []
    for v in x:
        try:
            if np.isnan(v):
                new_x.append(None)
            else:
                new_x.append(v)
        except TypeError:
            new_x.append(v)
    return tuple(new_x)
print('Before:\n', df.index)
idx = df.index.values
idx[:] = np.vectorize(replace_nan, otypes=['object'])(idx) # Replace values in np.array
print('After:\n', df.index)
Result:
Before:
MultiIndex([('a', True),
('b', True),
('c', False),
('d', nan)],
names=['c1', 'c2'])
After:
MultiIndex([('a', True),
('b', True),
('c', False),
('d', None)],
names=['c1', 'c2'])
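Note that writing into the array returned by `df.index.values` changes what the index prints, but it presumably leaves the index's internal codes untouched, so the None values may not survive operations that rebuild the index. A more conservative sketch, assuming it is acceptable to hold the None-bearing tuples outside the MultiIndex, is to convert when reading. The helper name `index_tuples_with_none` is hypothetical and not part of pandas.

```python
import pandas as pd


def index_tuples_with_none(index):
    """Return the index entries as plain tuples, with missing values as None."""
    return [tuple(None if pd.isna(v) else v for v in entry) for entry in index]


df = pd.DataFrame(
    [["a", True, 1], ["b", True, 2], ["c", False, 3], ["d", None, 4]],
    columns=["c1", "c2", "c3"],
).set_index(["c1", "c2"])

# The DataFrame index itself is left unchanged; only the returned list holds None.
tuples = index_tuples_with_none(df.index)
print(tuples[3])
```

This avoids depending on how the MultiIndex caches its values, at the cost of keeping the converted tuples in a separate list.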