Iterate over rows in a DataFrame and compare each one to the rest of the rows
So I have a DataFrame that I group and then apply a function to. Now, for each row in the frame, I want to check it against the remaining rows in the DataFrame. If a pair matches some conditions, I would like to add both rows to a different DataFrame with some sort of tag and remove them from the original. If a row doesn't pass the conditions, I keep it there and move on to the next row.
For example:
   time   status   number  action  fname   lname
0  10.30  Active   2       0       Adrian  Peter
1  11.01  Active   3       2       Peter   Thomas
2  11.05  Passive  2       0       Thomas  Adrian
3  11.07  Passive  2       1       Jen     Anniston
So I do something like:
def f(x):
    # I want to perform some tasks here on the group x, and then, with the
    # remaining rows, see whether index 0 has a similar number and action
    # anywhere in the rest of the frame.
    # If true, I want to put the pair in a different DataFrame, tag it, and
    # remove the pair from the original df.
    # I then want to move on to the next index and do the same.
    # If false after looking at all the data in the frame, I want to delete
    # this row from the original df too.
    return x

df.groupby('status').apply(f)
If your desired function (f) has side effects, I'd use df.iterrows() and write the function in plain Python.
for index, row in df.iterrows():
    pass  # Do stuff with index and row here
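To make the loop above a bit more concrete, here is a rough sketch of how the pairing could look with iterrows(). It assumes that "matching" means having the same number and action, that each row is paired with the first unused match, and that unmatched rows simply stay in the original frame. The helper name split_matches and the pair column are made up for illustration; they are not from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "time": [10.30, 11.01, 11.05, 11.07],
    "status": ["Active", "Active", "Passive", "Passive"],
    "number": [2, 3, 2, 2],
    "action": [0, 2, 0, 1],
    "fname": ["Adrian", "Peter", "Thomas", "Jen"],
    "lname": ["Peter", "Thomas", "Adrian", "Anniston"],
})

def split_matches(frame):
    """Hypothetical helper: pair rows that share number and action.

    Returns (matched, remaining). The input frame is not modified.
    """
    used = set()   # index labels that already belong to a pair
    tags = {}      # index label -> pair id
    pair_id = 0
    for idx, row in frame.iterrows():
        if idx in used:
            continue
        # Candidate partners: every row except this one and those already paired.
        rest = frame.drop(index=list(used | {idx}))
        hits = rest[(rest["number"] == row["number"])
                    & (rest["action"] == row["action"])]
        if not hits.empty:
            partner = hits.index[0]          # assumption: take the first match
            tags[idx] = tags[partner] = pair_id
            used.update({idx, partner})
            pair_id += 1
    # Matched rows get a "pair" tag column; everything else stays behind.
    matched = frame.loc[list(tags)].assign(pair=pd.Series(tags))
    remaining = frame.drop(index=list(tags))
    return matched, remaining

matched, remaining = split_matches(df)
```

On the sample data this pairs rows 0 and 2 when run on the whole frame. To restrict matching to rows with the same status, the same helper could be called on each group from df.groupby("status") instead; note that the sample then yields no pairs, because the two matching rows have different statuses.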
You can also create a flag column holding a boolean that evaluates your condition, then pull out all rows where that value is True:
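# Evaluate the condition once per row and store the result as a boolean flag
df['tagged'] = df.apply(lambda row: <<condition goes here>>, axis=1)
# Rows where the flag is set go to their own frame; the rest stay in df
tagged_rows = df[df['tagged']]
df = df[~df['tagged']]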
(I'm not 100% sure about the syntax; I don't have an interpreter on hand.)
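For what it's worth, if the condition really is just "another row has the same number and action" (that is my assumption about the question, not something it states exactly), the flag column can probably be built without apply at all. A minimal sketch using DataFrame.duplicated with keep=False, which marks every member of a duplicate group rather than only the later ones:

```python
import pandas as pd

df = pd.DataFrame({
    "time": [10.30, 11.01, 11.05, 11.07],
    "status": ["Active", "Active", "Passive", "Passive"],
    "number": [2, 3, 2, 2],
    "action": [0, 2, 0, 1],
    "fname": ["Adrian", "Peter", "Thomas", "Jen"],
    "lname": ["Peter", "Thomas", "Adrian", "Anniston"],
})

# Flag every row whose (number, action) combination occurs more than once.
df["tagged"] = df.duplicated(subset=["number", "action"], keep=False)

# Split on the flag: flagged rows move out, the rest stay in df.
tagged_rows = df[df["tagged"]].copy()
df = df[~df["tagged"]].drop(columns="tagged")
```

Adding "status" to subset would limit matches to rows within the same status group, which is closer to the groupby in the question.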