Iterate over a groupby dataframe to operate in each row
I have a DataFrame like this:
    subject  trial  attended
0         1      1         1
1         1      3         0
2         1      4         1
3         1      7         0
4         1      8         1
5         2      1         1
6         2      2         1
7         2      6         1
8         2      8         0
9         2      9         1
10        2     11         1
11        2     12         1
12        2     13         1
13        2     14         1
14        2     15         1
I am trying to define a function for this, but it doesn't work:
def count_attended():
    sum_reactive = 0
    dict_attended = {}
    for i, g in reactive.groupby(['subject']):
        for row in g:
            if g['attended'][row] == 1:
                sum_reactive += 1
                if sum_reactive == 4:
                    dict_attended.update({g['subject'] : g['trial'][row]})
                    return dict_attended
    return dict_attended
I think I'm not clear on how to iterate inside each GroupBy DataFrame. I'm quite new to pandas.
IIUC, try:
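# keep only the attended trials
df = df.query('attended == 1')
# rows where the per-subject running count of attended trials reaches 4
df.loc[df.groupby('subject')['attended'].cumsum() == 4, ['subject', 'trial']].to_dict(orient='records')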
Output:
[{'subject': 2, 'trial': 9}]
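To reproduce this end to end, here is a minimal sketch that rebuilds the sample data from the question. The names attended_only, running_count and result are my own, not from the original snippet, and I spell out orient="records" in full because, as far as I know, newer pandas releases no longer accept the shortened form.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "subject":  [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "trial":    [1, 3, 4, 7, 8, 1, 2, 6, 8, 9, 11, 12, 13, 14, 15],
    "attended": [1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1],
})

# Keep only the attended trials (attended_only is an illustrative name).
attended_only = df.query("attended == 1")

# Running count of attended trials within each subject.
running_count = attended_only.groupby("subject")["attended"].cumsum()

# Select the rows where the running count reaches 4, keeping two columns.
result = attended_only.loc[running_count == 4, ["subject", "trial"]].to_dict(orient="records")
print(result)
```

Subject 1 only attends three trials in this data, so it never reaches a count of 4 and does not appear in the result.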
Using groupby with cumsum keeps a running count of the attended trials for each subject; checking where that count equals 4 creates a boolean Series. You can use this boolean Series for boolean indexing to filter the DataFrame down to the matching rows. Lastly, loc with a list of columns selects only subject and trial.
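If you do want to iterate over each group explicitly, as the function in the question attempts, a loop-based sketch might look like the following. The function name fourth_attended_trial and its target parameter are hypothetical. It differs from the original attempt in a few ways: iterating a DataFrame directly yields column names rather than rows, so it uses itertuples; the counter is reset for every subject; and it uses break instead of return so that later subjects are still checked.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "subject":  [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "trial":    [1, 3, 4, 7, 8, 1, 2, 6, 8, 9, 11, 12, 13, 14, 15],
    "attended": [1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1],
})


def fourth_attended_trial(frame, target=4):
    """Map each subject to the trial of its target-th attended row.

    Hypothetical helper for illustration; subjects that never reach
    the target count are left out of the result.
    """
    found = {}
    # Grouping by a plain column name makes each key a scalar subject id.
    for subject, group in frame.groupby("subject"):
        count = 0  # reset the counter for every subject
        # itertuples yields one row at a time as a namedtuple.
        for row in group.itertuples(index=False):
            if row.attended == 1:
                count += 1
                if count == target:
                    found[subject] = row.trial
                    break  # stop scanning this subject only
    return found


result = fourth_attended_trial(df)
print(result)
```

For data of any real size the cumsum version is usually preferable, since it avoids a Python-level loop over rows.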