Pandas - getting an index of a row in a pandas apply function

Here is my pandas code:

def calcObj(row):
    d = dict(calc1 = iferror(row.Hours1.sum(), row.Hours2.sum(), 0, '+'))
    
    if row.Process == 'A': # this doesn't work
        d['ProcessKey'] = 700
    else:
        d['ProcessKey'] = 500
    
    return pd.Series(d)

    
df.groupby(['MainProcess']).apply(calcObj)

I am trying to check whether a process name is A and, if it is, return a different value. Unfortunately it doesn't work and I get the following error:

AttributeError: 'DataFrame' object has no attribute 'Process'

I assume it's because I am not grouping by Process, only by MainProcess.

Is there any way to get access to this item within the apply function? Any other workaround would also be very helpful.

Here is my example dataframe. Bg, MainProcess, CoreProcess and Process are index levels; Hours1 and Hours2 are columns:

Bg              MainProcess                CoreProcess          Process      Hours1      Hours2                                        
Building1       MainProcess-1              CoreProcess-1        S-Process-1     150         250
                                                                S-Process-2     150         250
                                           CoreProcess-2        S-Process-3     150         250
                                                                S-Process-1     150         250
                                                                S-Process-2     150         250
Building2       MainProcess-2              CoreProcess-3        S-Process-1     150         250
                                                                S-Process-2     150         250
                MainProcess-3              CoreProcess-4        S-Process-1     150         250
                                                                S-Process-2     150         250
                                                                S-Process-3     150         250

Beware: the levels in the index are not columns of the DataFrame!

In your example, since Process is a level of the (multi-)index, df['Process'] will raise a KeyError regardless of the groupby. You must move that level out of the index and back into a column before you can use it. For example, you could reset it before the groupby:

df.reset_index(level='Process').groupby(['MainProcess']).apply(calcObj)
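To see the effect of the reset in isolation, here is a minimal sketch. The data is hypothetical, shaped like the example frame in the question; only reset_index(level='Process') comes from the line above.

```python
import pandas as pd

# Hypothetical data shaped like the example frame in the question.
index = pd.MultiIndex.from_tuples(
    [
        ("Building1", "MainProcess-1", "CoreProcess-1", "S-Process-1"),
        ("Building1", "MainProcess-1", "CoreProcess-1", "S-Process-2"),
        ("Building2", "MainProcess-2", "CoreProcess-3", "S-Process-1"),
    ],
    names=["Bg", "MainProcess", "CoreProcess", "Process"],
)
df = pd.DataFrame(
    {"Hours1": [150, 150, 150], "Hours2": [250, 250, 250]}, index=index
)

# Process is an index level, so column-style access fails.
try:
    df["Process"]
    raised = False
except KeyError:
    raised = True

# Moving only that level out of the index turns it into an ordinary column,
# while the other three levels stay in the index.
flat = df.reset_index(level="Process")
columns_after = list(flat.columns)
index_names_after = list(flat.index.names)
```

After the reset, Process sits alongside Hours1 and Hours2 as a column, and MainProcess is still an index level, so groupby(['MainProcess']) keeps working.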

But beware: calcObj will not receive individual rows here. It receives the sub-DataFrames whose rows share the same MainProcess value, so row.Process is a whole Series rather than a single string.
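Because each call gets a whole group, the comparison with 'A' has to be reduced to a single True/False. The following is only a sketch under stated assumptions: the data is hypothetical, calc_obj is a renamed stand-in for calcObj, a plain sum replaces the question's undefined iferror helper, and the rule "use 700 if any row in the group has Process == 'A'" is a guess at the intent, since the question does not say how a mixed group should be keyed.

```python
import pandas as pd

# Hypothetical flat frame: Process is already a regular column here.
df = pd.DataFrame(
    {
        "MainProcess": ["MainProcess-1", "MainProcess-1", "MainProcess-2"],
        "Process": ["A", "S-Process-2", "S-Process-1"],
        "Hours1": [150, 150, 150],
        "Hours2": [250, 250, 250],
    }
)


def calc_obj(group):
    # group is a DataFrame holding every row of one MainProcess,
    # so group["Process"] is a Series, not a single string.
    total = group["Hours1"].sum() + group["Hours2"].sum()  # stand-in for iferror(...)
    # Assumption: the key is 700 when any row of the group has Process == "A".
    has_a = (group["Process"] == "A").any()
    return pd.Series({"calc1": total, "ProcessKey": 700 if has_a else 500})


# Selecting the needed columns explicitly keeps the grouping column
# out of the frames passed to calc_obj.
result = df.groupby("MainProcess")[["Process", "Hours1", "Hours2"]].apply(calc_obj)
```

If the key is really meant per row rather than per group, it may be simpler to compute it as a column before grouping instead of inside apply.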
