I have a multi-indexed dataframe which contains some NaN values inside its index and rows.
In:
import pandas as pd
import numpy as np
row1 = {'index1' : 'abc', 'col1' : 'some_value', 'col3' : True}
row2 = {'index2' : 'xyz', 'col2' : 'other_value', 'col3' : np.nan}
row3 = {'index1' : 'def', 'col1' : 'different_value', 'col3' : False}
row4 = {'index2' : 'uvw', 'col2' : 'same_value', 'col3' : np.nan}
df = pd.DataFrame([row1, row2, row3, row4])
df.set_index(['index1', 'index2'], inplace=True)
print(df)
Out:
                          col1         col2   col3
index1 index2
abc    NaN          some_value          NaN   True
NaN    xyz                 NaN  other_value    NaN
def    NaN     different_value          NaN  False
NaN    uvw                 NaN   same_value    NaN
Is there a possibility to get a subset of that dataframe by the condition col3 == True
which also includes all "subrows" of the row where that condition holds?
When I go for
print(df[df.col3 == True])
I get
                     col1  col2  col3
index1 index2
abc    NaN     some_value   NaN  True
which is the row where the condition holds. However, what I am looking for is
                     col1         col2  col3
index1 index2
abc    NaN     some_value          NaN  True
NaN    xyz            NaN  other_value   NaN
which includes the row that does not have the True value itself but is a "subrow" of the row with index1 == abc.
Is that possible? Or is the dataframe messed up and should be structured in a different way?
A simple solution would be to just use a condition on the padded col3, where the NaNs are replaced with the value of the row they belong to. For example:
>>> df['col3'].fillna(method='pad')
index1  index2
abc     NaN        True
NaN     xyz        True
def     NaN       False
NaN     uvw       False
Name: col3, dtype: bool
Now you can apply the condition like this:
>>> df[df['col3'].fillna(method='pad')]
                     col1         col2  col3
index1 index2
abc    NaN     some_value          NaN  True
NaN    xyz            NaN  other_value   NaN
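For newer pandas versions, here is a self-contained sketch of the same padding idea. It assumes the frame is built exactly as in the question and that every "subrow" directly follows the row it belongs to. Since pandas 2.1 the method argument of fillna is deprecated in favour of ffill(), so the sketch uses that instead. The names rows, mask and subset are only illustrative, and the .eq(True) comparison is an addition of mine so that a value that is still NaN after padding is treated as False.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
rows = [
    {'index1': 'abc', 'col1': 'some_value', 'col3': True},
    {'index2': 'xyz', 'col2': 'other_value', 'col3': np.nan},
    {'index1': 'def', 'col1': 'different_value', 'col3': False},
    {'index2': 'uvw', 'col2': 'same_value', 'col3': np.nan},
]
df = pd.DataFrame(rows).set_index(['index1', 'index2'])

# Forward-fill col3 so each "subrow" inherits the flag of the row above it,
# then compare with True so any value that is still NaN counts as False.
mask = df['col3'].ffill().eq(True)

# Boolean indexing keeps the matching row together with its subrows.
subset = df[mask]
print(subset)
```

Note that this depends on row order: if the frame is sorted or shuffled so that subrows no longer sit right below their parent row, the padded values will be attached to the wrong rows.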