I have a pandas dataframe containing indices that have a one-to-many relationship. A very simplified and shortened example of my data is shown in the DataFrame Example link. I want to get a list or Series or ndarray of the unique namIdx values in which nCldLayers <= 1. The final result should show indices of 601 and 603.
I am able to accomplish this with the 3 statements below, but I am wondering if there is a much better, more succinct way with perhaps 'filter', 'select', or 'where'.
grouped = (namToViirs['nCldLayers'] <= 1).groupby(namToViirs.index).all(axis=0)
grouped = grouped[grouped == True]
filterIndex = grouped.index
Is there a better approach to accomplishing this result by applying the logical condition (namToViirs['nCldLayers'] <= 1) in a subsequent part of the chain, i.e., first group, then apply the logical condition, and then retrieve only the namIdx where the logical result is true for each member of the group?
I think your code works nicely; you only need a few small changes:
In all(), axis=0 can be omitted.
In grouped[grouped==True], the ==True can be omitted.
grouped = (namToViirs['nCldLayers'] <= 1).groupby(level='namIdx').all()
grouped = grouped[grouped]
filterIndex = grouped.index
print(filterIndex)
Int64Index([601, 603], dtype='int64', name='namIdx')
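For anyone who wants to try this without the original data, here is a minimal sketch. The sample values are hypothetical (the DataFrame from the question is not reproduced here); only the names namToViirs, nCldLayers and namIdx come from the question.

```python
import pandas as pd

# Hypothetical sample data: the index 'namIdx' repeats (one-to-many).
namToViirs = pd.DataFrame(
    {"nCldLayers": [1, 0, 2, 1, 1, 1, 3]},
    index=pd.Index([601, 601, 602, 602, 603, 603, 604], name="namIdx"),
)

# A group is True only if every one of its rows has nCldLayers <= 1.
grouped = (namToViirs["nCldLayers"] <= 1).groupby(level="namIdx").all()

# Index the boolean Series with itself to keep only the True groups.
filterIndex = grouped[grouped].index
print(filterIndex.tolist())
```

With this sample, 602 and 604 each have at least one row with more than one cloud layer, so only 601 and 603 remain.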
I think it is better to filter first by boolean indexing and then apply groupby, because fewer loops means better performance.
For question 1, see jezrael's answer. For question 2, you could treat the indexes as sets:
namToViirs.index[namToViirs.nCldLayers <= 1] \
.difference(namToViirs.index[namToViirs.nCldLayers > 1])
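As a rough illustration of the set idea, the sketch below runs the same expression on hypothetical sample data (the values are made up; only the column and index names come from the question). The intermediate names ok and bad are introduced here for readability.

```python
import pandas as pd

# Hypothetical sample data with a repeating 'namIdx' index.
namToViirs = pd.DataFrame(
    {"nCldLayers": [1, 0, 2, 1, 1, 1, 3]},
    index=pd.Index([601, 601, 602, 602, 603, 603, 604], name="namIdx"),
)

# Index labels that have at least one row satisfying the condition ...
ok = namToViirs.index[namToViirs.nCldLayers <= 1]
# ... and labels that have at least one row violating it.
bad = namToViirs.index[namToViirs.nCldLayers > 1]

# Set difference: labels that satisfy the condition and never violate it.
filterIndex = ok.difference(bad)
print(list(filterIndex))
```

Note that 602 appears in both ok and bad, which is why the difference is needed rather than the first mask alone.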
You might be interested in this answer.
The implementation is currently a bit hackish, but it should reduce your statement above to:
filterIndex = ((namToViirs['nCldLayers'] <= 1)
               .groupby(namToViirs.index).all(axis=0)[W].index)
EDIT: also see this answer for an analogous approach not requiring external components, resulting in:
filterIndex = ((namToViirs['nCldLayers'] <= 1)
               .groupby(namToViirs.index).all(axis=0)[lambda x: x].index)
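A small runnable sketch of the callable-indexing variant, on hypothetical sample data (the values are invented for illustration). It calls .all() without axis=0, on the assumption that the argument is unnecessary here and may not be accepted by the groupby .all() in recent pandas versions.

```python
import pandas as pd

# Hypothetical sample data with a repeating 'namIdx' index.
namToViirs = pd.DataFrame(
    {"nCldLayers": [1, 0, 2, 1, 1, 1, 3]},
    index=pd.Index([601, 601, 602, 602, 603, 603, 604], name="namIdx"),
)

# Passing a callable to [] makes pandas call it on the Series being indexed;
# returning the boolean Series itself keeps only the groups that are True.
filterIndex = ((namToViirs["nCldLayers"] <= 1)
               .groupby(namToViirs.index)
               .all()[lambda x: x]
               .index)
print(filterIndex.tolist())
```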
Another option is to use .pipe() and a function which applies the desired filtering. For instance:
filterIndex = ((namToViirs['nCldLayers'] <= 1)
               .groupby(namToViirs.index)
               .all(axis=0)
               .pipe(lambda s: s[s])
               .index)
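If the lambda feels cryptic, the same chain can be sketched with a named function. The helper keep_true is a hypothetical name introduced here, the sample data is invented, and .all() is called without axis=0.

```python
import pandas as pd


def keep_true(s: pd.Series) -> pd.Series:
    """Keep only the entries of a boolean Series that are True."""
    return s[s]


# Hypothetical sample data with a repeating 'namIdx' index.
namToViirs = pd.DataFrame(
    {"nCldLayers": [1, 0, 2, 1, 1, 1, 3]},
    index=pd.Index([601, 601, 602, 602, 603, 603, 604], name="namIdx"),
)

# .pipe() hands the intermediate Series to keep_true without breaking the chain.
filterIndex = ((namToViirs["nCldLayers"] <= 1)
               .groupby(namToViirs.index)
               .all()
               .pipe(keep_true)
               .index)
print(filterIndex.tolist())
```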