I'm trying to find the difference between two Pandas MultiIndex
objects of different shapes. I've used:
df1.index.difference(df2)
and receive
TypeError: '<' not supported between instances of 'float' and 'str'
My indices are str and datetime, but I suspect there are NaNs
hidden there (the floats). Hence my question:
What's the best way to find the NaNs somewhere in the MultiIndex? How does one iterate through the levels and names? Can I use something like isna()?
Many functions are not implemented for MultiIndex.
You need to convert the MultiIndex to a DataFrame with MultiIndex.to_frame first:
# W-B sample
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples([(np.nan, 1), (1, 1), (1, 2)])
print(idx.to_frame())
         0  1
NaN 1  NaN  1
1   1  1.0  1
    2  1.0  2
print(idx.to_frame().isnull())
           0      1
NaN 1   True  False
1   1  False  False
    2  False  False
Or use the DataFrame constructor:
print(pd.DataFrame(idx.tolist()))
     0  1
0  NaN  1
1  1.0  1
2  1.0  2
The conversion is necessary because isnull cannot be called on the MultiIndex directly:
print(pd.isnull(idx))
NotImplementedError: isna is not defined for MultiIndex
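On the part of the question about iterating through levels and names: a minimal sketch, assuming the level names "first" and "second" (they are made up for illustration, the sample index above is unnamed). It relies on get_level_values returning a flat Index, which does support isna().

```python
import numpy as np
import pandas as pd

# Same sample index as above, with hypothetical level names added.
idx = pd.MultiIndex.from_tuples(
    [(np.nan, 1), (1, 1), (1, 2)], names=["first", "second"]
)


def nan_masks_by_level(mi):
    """Return {level name (or position if unnamed): boolean NaN mask}."""
    masks = {}
    for pos, name in enumerate(mi.names):
        key = pos if name is None else name
        # get_level_values gives a flat Index, which supports isna()
        masks[key] = mi.get_level_values(pos).isna()
    return masks


masks = nan_masks_by_level(idx)
for level, mask in masks.items():
    # Report whether the level holds any NaN and at which row positions
    print(level, "has NaN:", mask.any(), "at", np.flatnonzero(mask).tolist())
```

This avoids building a DataFrame, which may matter for large indices, but the to_frame approach is probably easier to read.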
EDIT:
To keep only the rows that contain at least one NaN, use any(axis=1) with boolean indexing:
df = idx.to_frame()
print(df[df.isna().any(axis=1)])
         0  1
NaN 1  NaN  1
It is also possible to filter the MultiIndex itself, but then it is necessary to add MultiIndex.remove_unused_levels:
print(idx[idx.to_frame().isna().any(axis=1)].remove_unused_levels())
MultiIndex(levels=[[], [1]],
           labels=[[-1], [0]])
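Coming back to the original goal, the same mask can be inverted to drop the NaN entries before calling difference. This is only a sketch: idx1 and idx2 are made-up stand-ins for df1.index and df2.index, drop_nan_rows is a hypothetical helper, and it assumes the TypeError really does come from NaNs mixed in with the strings.

```python
import numpy as np
import pandas as pd

# Hypothetical indices standing in for df1.index and df2.index.
idx1 = pd.MultiIndex.from_tuples([("a", 1), ("b", 2), (np.nan, 3)])
idx2 = pd.MultiIndex.from_tuples([("a", 1)])


def drop_nan_rows(mi):
    """Keep only the entries in which every level is non-missing."""
    # One boolean per entry: True if any level of that entry is NaN
    has_nan = mi.to_frame(index=False).isna().any(axis=1).to_numpy()
    return mi[~has_nan]


clean1 = drop_nan_rows(idx1)
clean2 = drop_nan_rows(idx2)

# Entries of clean1 that do not appear in clean2
result = clean1.difference(clean2)
print(result.tolist())
```

Note that both arguments to difference are indices here. Whether dropping the NaN entries is acceptable depends on what they mean in your data.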
We can also use reset_index and then isna:
idx = pd.MultiIndex.from_tuples([(np.nan, 1), (1, 1), (1, 2)])
df = pd.DataFrame([1, 2, 3], index=idx)
df.reset_index().filter(like='level_').isna()
Out[304]:
   level_0  level_1
0     True    False
1    False    False
2    False    False
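One caveat: filter(like='level_') only matches when the index levels are unnamed, because reset_index then creates the columns level_0, level_1, and so on. If the levels have names, the columns take those names instead. A small sketch for that case, assuming the made-up level names "key" and "num" and a made-up column "value":

```python
import numpy as np
import pandas as pd

# Same data as above, but with hypothetical level and column names.
idx = pd.MultiIndex.from_tuples(
    [(np.nan, 1), (1, 1), (1, 2)], names=["key", "num"]
)
df = pd.DataFrame({"value": [1, 2, 3]}, index=idx)

# After reset_index the former levels are ordinary columns named
# after the levels, so select them by name rather than by prefix.
level_cols = list(df.index.names)
mask = df.reset_index()[level_cols].isna()
print(mask)
```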