How to count intersections between columns with non-null values and row
Hey, the title might be hard to understand, so here's a small sample of my DataFrame.
A B C D E F G H J K action
0 22 noise
1 68 junk
2 93 junk
3 80 junk
4 57 noise
The action column only has two values (noise and junk). For instance, in the first row, column 'F' has a value of 22 and its action is noise. I want to count how many times 'F' has a non-null value when action is 'noise' and how many times when action is 'junk'.
Of course, I want to count this for all the other single-letter columns as well.
So I want a dictionary that looks something like this, where each inner dictionary holds the counts per action.
{'F': {'noise': 1, 'junk': 0},
'G': {'noise': 0, 'junk': 1},
'E': {'noise': 0, 'junk': 1},
'C': {'noise': 0, 'junk': 1},
'J': {'noise': 1, 'junk': 0}
}
I've tried looping with df.iterrows() and using df.notnull(), but I can't seem to get the logic right.
Edit: updated the expected output.
Use notnull() to mask your df, groupby on action, and then take a simple sum:
df.iloc[:, :-1].notnull().astype(int).groupby(df.action).sum().to_dict()
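To make the one-liner above easier to try out, here is a minimal runnable sketch. The sample table in the question lost its column alignment, so the placement of 68, 93 and 80 in columns G, E and C is an assumption inferred from the expected output. Only 22 in 'F' (row 0) and 57 in 'J' (row 4) follow directly from the question. The final filtering step is also an addition, in case only columns with at least one non-null value are wanted.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the sample: all-NaN letter columns plus "action".
letter_columns = list("ABCDEFGHJK")
df = pd.DataFrame(np.nan, index=range(5), columns=letter_columns)
df.loc[0, "F"] = 22
df.loc[1, "G"] = 68  # assumed column, the original alignment was lost
df.loc[2, "E"] = 93  # assumed column
df.loc[3, "C"] = 80  # assumed column
df.loc[4, "J"] = 57
df["action"] = ["noise", "junk", "junk", "junk", "noise"]

# Mask non-null cells as 1/0, then sum the mask within each action group.
counts = (
    df.drop(columns="action")
      .notnull()
      .astype(int)
      .groupby(df["action"])
      .sum()
)

# to_dict() gives {column: {action: count}}, including all-zero columns.
result = counts.to_dict()

# Optional: keep only the columns that have at least one non-null value.
nonzero = {col: per_action for col, per_action in result.items()
           if any(per_action.values())}
print(nonzero)
```

Dropping "action" by name instead of using iloc[:, :-1] avoids relying on it being the last column. Both give the same result for this frame.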