[英]How to select dataframe rows based on index value and columns value criteria?
I have a dataframe that is a stacked version of n=9
dataframes:我有一个 dataframe,它是
n=9
数据帧的堆叠版本:
df
f a config
0 0.491 0.368 old.000
1 0.369 0.333 old.000
2 0.372 0.276 old.000
3 0.346 0.300 old.000
4 0.213 0.161 old.000
.. ... ... ...
212 1.000 1.000 fin.111
213 1.000 1.000 fin.111
214 1.000 1.000 fin.111
215 1.000 1.000 fin.111
216 1.000 1.000 fin.111
[1953 rows x 3 columns]
Each "stacked" sub-dataframe corresponds to a different value of config
, ie:每个“堆叠”子数据帧对应于不同的
config
值,即:
df['config'].unique()
array(['old.000', 'fin.000', 'fin.001', 'fin.010', 'fin.011', 'fin.100',
'fin.101', 'fin.110', 'fin.111'], dtype=object)
I want to "filter" this dataframe by a criteria (composed of index
and config
value) given by a pd.Series:我想通过 pd.Series 给出的标准(由
index
和config
值组成)“过滤”这个 dataframe:
ser_criteria
0 old.001
1 fin.101
2 fin.100
3 fin.101
4 fin.101
...
212 fin.000
213 old.000
214 old.000
215 old.000
216 old.000
Length: 217, dtype: object
So, I would need my output to be given by:因此,我需要通过以下方式提供我的 output:
df_filtered
f a config
0 0.481 0.368 old.001
1 0.569 0.333 fin.101
2 0.672 0.276 fin.100
3 0.378 0.111 fin.101
4 0.987 0.213 fin.101
.. ... ... ...
212 0.500 0.111 fin.000
213 1.000 1.000 old.000
214 0.765 0.123 old.000
215 0.000 1.000 old.000
216 0.333 0.123 old.000
[217 rows x 3 columns]
What is the more efficient way to do this?更有效的方法是什么? The only way I could find was to do this element by element (from index)...
我能找到的唯一方法是逐个元素(从索引)执行此操作...
Convert index
to index
columns in both DataFrames and use right join:将
index
转换为两个 DataFrame 中的index
列并使用右连接:
df = (ser_criteria.rename('config')
.reset_index()
.merge(df.reset_index(), on=['index','config'],how='right')
.drop('index', axis=1))
If the index values in your filter criteria are not important, you can do the following:如果过滤条件中的索引值不重要,您可以执行以下操作:
import pandas as pd
# Assign your dataframe to df (example)
df = pd.read_csv("Documents/dataframe.tsv", sep="\t")
df
f a config
0 0.491 0.368 old.000
1 0.369 0.333 old.000
2 0.372 0.276 old.000
3 0.346 0.300 old.000
4 0.213 0.161 old.000
212 1.000 1.000 fin.111
213 1.000 1.000 fin.111
214 1.000 1.000 fin.111
215 1.000 1.000 fin.111
216 1.000 1.000 fin.111
# Assign your filter series to filter (example)
filter = pd.Series(('old.001', 'fin.101', 'fin.100', 'fin.101', 'fin.101', 'fin.000', 'old.000', 'old.000', 'old.000', 'old.000'))
filter
0 old.001
1 fin.101
2 fin.100
3 fin.101
4 fin.101
5 fin.000
6 old.000
7 old.000
8 old.000
9 old.000
dtype: object
# Then subset the dataframe by which rows have a config value in your filter
df[df['config'].isin(filter)]
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.