Iterate over pandas rows and get groups of values?

I have a DataFrame (Df) that looks like:

marker | 0 1 2 3
________________
A      | + - + -
B      | - - + -
C      | - + - -

and I want to iterate over the columns and obtain the names of the rows where there is a +, i.e. group all + rows within each column.

I attempted to do this by:

lis = []

for n in list(range(0,3)):
    cli = Df[n].tolist()
    for x,m in zip(cli,markers): # markers is a list of the row names ['A','B','C']
        cl_li = []
        if x == '+':
            mset = m+x
            cl_li.append(mset)
        else:
            continue
        lis.append(cl_li)

print (lis)

But I am getting each row name as its own sublist, whereas I want something like:

newdf = 
____________
0   |  A+
1   |  C+
2   |  A+B+

# N.B. group 3 is not included because it has no + rows

Try using apply and join on a boolean matrix:

(df == '+').apply(lambda x: '+'.join(x.index[x])+'+').to_frame()

Output:

           0
marker      
0         A+
1         C+
2       A+B+
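For anyone who wants to run this end to end, here is a minimal sketch under a few assumptions: the frame is rebuilt from the table in the question, and the name df and the integer column labels 0 to 3 are my guesses at the original setup. It is a slight variant of the one-liner above. It appends '+' to each label before joining, so a column with no '+' rows gives an empty string that can be filtered out (with '+'.join(...) + '+' such a column would give a lone '+').

```python
import pandas as pd

# Sample data rebuilt from the question; `df` and the integer column
# labels are assumptions about the original setup.
df = pd.DataFrame(
    [["+", "-", "+", "-"],
     ["-", "-", "+", "-"],
     ["-", "+", "-", "-"]],
    index=pd.Index(["A", "B", "C"], name="marker"),
    columns=[0, 1, 2, 3],
)

# Boolean matrix: True wherever a cell holds '+'.
mask = df == "+"

# For each column, keep the row labels whose value is True and
# concatenate them, adding '+' after every label.
groups = mask.apply(lambda col: "".join(label + "+" for label in col[col].index))

# Column 3 has no '+' rows, so it yields an empty string; drop it.
groups = groups[groups != ""]

print(groups.to_frame())
```

The filtering step is what leaves group 3 out, as the question asks.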

Or, using dot and a boolean matrix:

(df.index.to_series() + '+').dot(df == '+')

Output:

           0
marker      
0         A+
1         C+
2       A+B+
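The dot version relies on Python string arithmetic: a string times True is the string itself, a string times False is '', and adding strings concatenates them. A rough, self-contained sketch follows. The sample frame is rebuilt from the question (the name df is an assumption), and the cast to object is my own precaution so that the values are plain Python strings, not something the one-liner above requires.

```python
import pandas as pd

# Sample data rebuilt from the question; `df` is an assumed name.
df = pd.DataFrame(
    [["+", "-", "+", "-"],
     ["-", "-", "+", "-"],
     ["-", "+", "-", "-"]],
    index=pd.Index(["A", "B", "C"], name="marker"),
    columns=[0, 1, 2, 3],
)

mask = df == "+"

# Row labels with '+' appended: 'A+', 'B+', 'C+'.
# The object cast (an assumption, for safety) keeps them as Python strings.
labels = (df.index.to_series() + "+").astype(object)

# Dot product per column: sum over rows of label * bool, which for
# strings means "concatenate the labels where the mask is True".
joined = labels.dot(mask)

# Columns without any '+' come out as '', so filter them.
joined = joined[joined != ""]

print(joined)
```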

I suggest a more idiomatic pandas solution than your loop.

Apply a lambda function to each column:

result = df.apply(lambda col: ''.join(col[col == '+'].index + '+'))

To drop empty items from the result, run:

result = result[result != '']

The result is:

0      A+
1      C+
2    A+B+
dtype: object

If you want the result as a DataFrame (instead of a Series), run:

result = result[result != ''].to_frame()
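To connect this back to the loop in the question: there, cl_li = [] is created inside the inner loop and lis.append(cl_li) runs once per matching row, which is why every row name ends up in its own sublist. Below is a minimal sketch of a repaired loop that builds one group per column. The frame is rebuilt from the question, and df, members and groups are names I chose for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; `df` is an assumed name.
df = pd.DataFrame(
    [["+", "-", "+", "-"],
     ["-", "-", "+", "-"],
     ["-", "+", "-", "-"]],
    index=pd.Index(["A", "B", "C"], name="marker"),
    columns=[0, 1, 2, 3],
)

groups = []
for col in df.columns:
    # Collect the matching row names once per column, not once per row.
    members = [marker + "+" for marker, value in df[col].items() if value == "+"]
    if members:  # skip columns with no '+', such as column 3
        groups.append("".join(members))

print(groups)  # ['A+', 'C+', 'A+B+']
```

This gives the same groups as the apply-based version, just as a plain list.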
