How to create a column of zeros of a particular value based on a certain condition?
I have a df s19_df in a dictionary Bgf as follows:
BacksGas_Flow_sccm  ContextID  StepID  Time_Elapsed  iso_forest
61.81640625         7289972    19      40.503        -1
62.59765625         7289972    19      41.503        -1
63.671875           7289972    19      42.503         1
65.625              7289972    19      43.503         1
61.81640625         7289973    19      40.448        -1
62.59765625         7289973    19      41.448        -1
63.671875           7289973    19      42.448         1
65.625              7289973    19      43.448         1
I wrote a function that counts the +1s and -1s in the iso_forest column by doing a groupby on the ContextID column, and then calculates the ratio -1/1:
def minus1_plus1_ratio(dictionary, new_df, step_df):
    dictionary[new_df] = dictionary[step_df].groupby(['ContextID', 'iso_forest']).size().reset_index(name='count')
    dictionary[new_df] = pd.pivot_table(dictionary[new_df], values = 'count', columns = ['iso_forest'],
                                        index = ['ContextID']).fillna(value = 0)
    dictionary[new_df]['-1/1'] = (dictionary[new_df][-1])/(dictionary[new_df][1])
    dictionary[new_df] = dictionary[new_df].sort_values(by = '-1/1', ascending = False)
    return dictionary[new_df]
So, when I run the function on the above df
minus1_plus1_ratio(Bgf, 's19_-1/1', 's19_df')
it works perfectly fine, since the iso_forest column has both -1s and +1s.
But for a df as follows:
BacksGas_Flow_sccm  ContextID  StepID  Time_Elapsed  iso_forest
61.81640625         7289972    19      40.503        1
62.59765625         7289972    19      41.503        1
63.671875           7289972    19      42.503        1
65.625              7289972    19      43.503        1
61.81640625         7289973    19      40.448        1
62.59765625         7289973    19      41.448        1
63.671875           7289973    19      42.448        1
65.625              7289973    19      43.448        1
where the iso_forest column contains only +1s and no -1s, running the function throws KeyError: -1, because the pivot table then has no column named -1.
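The failure can be reproduced in isolation. The following is a minimal sketch, assuming only the ContextID and iso_forest columns matter (the other columns are left out, and the variable names are just for illustration). It repeats the groupby and pivot steps of the function on the all-+1 data:

```python
import pandas as pd

# Sample data with only +1 labels (IDs copied from the table above)
df = pd.DataFrame({
    "ContextID": [7289972] * 4 + [7289973] * 4,
    "iso_forest": [1] * 8,
})

# Same counting and pivoting steps as in minus1_plus1_ratio
counts = df.groupby(["ContextID", "iso_forest"]).size().reset_index(name="count")
pivot = pd.pivot_table(counts, values="count", columns=["iso_forest"],
                       index=["ContextID"]).fillna(value=0)

# The pivot only gets a column for labels that actually occur
print(list(pivot.columns))

# Selecting the missing label is what raises the error
try:
    pivot[-1]
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)
```

fillna does not help here: it fills missing cells in existing columns, but it cannot create a column for a label that never appears.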
So, what I would like to do is this: if there are no -1s, then before the step
dictionary[new_df]['-1/1'] = (dictionary[new_df][-1])/(dictionary[new_df][1])
a column named -1 has to be created and filled with zeros.
Similarly, there might be cases where only -1s are present and there are no +1s. In such a situation, a column named 1 must be created and filled with zeros.
Can someone help me with the logic here? How can I achieve this?
You can use unstack and reindex:
(df.groupby('ContextID').iso_forest
   .value_counts()
   .unstack(level=0, fill_value=0)
   .reindex([-1,1],fill_value=0).T
)
Output:
iso_forest  -1  1
ContextID
7289972      0  4
7289973      0  4
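To get from these counts to the ratio asked for in the question, the same idea can be folded back into the function. This is only a sketch under some assumptions: it takes the DataFrame directly instead of the dictionary and key names, and it reindexes the columns rather than transposing, which should give the same table. The name only_plus is made up for the example.

```python
import pandas as pd

def minus1_plus1_ratio(df):
    """Count -1 and +1 labels per ContextID and compute the -1/1 ratio."""
    counts = (df.groupby("ContextID")["iso_forest"]
                .value_counts()
                .unstack(fill_value=0)                    # labels become columns
                .reindex(columns=[-1, 1], fill_value=0))  # guarantee both columns exist
    counts["-1/1"] = counts[-1] / counts[1]
    return counts.sort_values(by="-1/1", ascending=False)

# Data with only +1 labels, as in the failing case from the question
only_plus = pd.DataFrame({
    "ContextID": [7289972] * 4 + [7289973] * 4,
    "iso_forest": [1] * 8,
})
result = minus1_plus1_ratio(only_plus)
print(result)
```

One thing to keep in mind for the opposite case: when a ContextID has only -1s, the 1 column is zero, so the division gives inf for that row instead of raising an error. Whether that is acceptable depends on how the ratio is used afterwards.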