Create a new column based on multiple columns in pandas

Question

I'm trying to create categories based on multiple columns in pandas, but it takes so long to run that I'm not sure it is correct. I left it for 30 minutes and it was still running, so I stopped it. I'm trying to create a new column based on several other columns (in my actual data there are about 15). However, when I try it on a smaller dataset it is very quick. Any suggestions?

other_cols = ['col1', 'col2', 'col3', 'col4', 'col5']


def labels(row):
    if ((row['col 6'] > 1) & (row[other_cols] < 1)).all():
        return 'Yes'
    if ((row['col 6'] >1) & (row['col 7'] >1) & (row[other_cols] <1)).all():
        return 'Maybe'
    if ((row['col 6'] <1) & (row['col 7']>1) & (row[other_cols] <1)).all():
        return 'no'

df['category'] = df.apply(lambda row: labels(row), axis=1)

Answer 1

You could try this:

other_cols = ['col1', 'col2', 'col3', 'col4', 'col5']


def labels(row):
    # Write the label into the row, then return the row so that
    # df.apply(..., axis=1) can rebuild the frame with the new column.
    if ((row['col 6'] > 1) & (row[other_cols] < 1)).all():
        row['category'] = 'Yes'
    elif ((row['col 6'] > 1) & (row['col 7'] > 1) & (row[other_cols] < 1)).all():
        row['category'] = 'Maybe'
    elif ((row['col 6'] < 1) & (row['col 7'] > 1) & (row[other_cols] < 1)).all():
        row['category'] = 'no'
    else:
        row['category'] = ''
    return row

df = df.apply(labels, axis=1)
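
If the row-wise apply is still slow on the full data, it may be worth comparing whole columns at once instead of calling a Python function per row. Here is a minimal sketch of that idea using np.select, assuming the same column names as in the question. The helper name add_category and the sample frame are made up for illustration, and I have not timed it on your data.

```python
import numpy as np
import pandas as pd

other_cols = ['col1', 'col2', 'col3', 'col4', 'col5']


def add_category(df):
    """Return a copy of df with a 'category' column built from column-wise comparisons."""
    # True for rows where every one of the other columns is below 1
    others_low = (df[other_cols] < 1).all(axis=1)

    # Same three tests as the row-wise version, in the same order.
    # np.select keeps the first condition that matches for each row.
    conditions = [
        (df['col 6'] > 1) & others_low,
        (df['col 6'] > 1) & (df['col 7'] > 1) & others_low,
        (df['col 6'] < 1) & (df['col 7'] > 1) & others_low,
    ]
    choices = ['Yes', 'Maybe', 'no']

    out = df.copy()
    out['category'] = np.select(conditions, choices, default='')
    return out


# Hypothetical sample data, only to show the call
sample = pd.DataFrame({
    'col1': [0, 0, 0, 5],
    'col2': [0, 0, 0, 0],
    'col3': [0, 0, 0, 0],
    'col4': [0, 0, 0, 0],
    'col5': [0, 0, 0, 0],
    'col 6': [2, 2, 0, 2],
    'col 7': [0, 2, 2, 2],
})
result = add_category(sample)
```

One thing to check in either version: every row that passes the 'Maybe' test also passes the 'Yes' test, and 'Yes' is tested first, so 'Maybe' is never assigned. If that is not what you want, the 'Maybe' condition probably needs to come before 'Yes'.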

What is the size of your dataset?

Sorry, I can't comment yet; I'm still new here.

Answer 1
0 2021-01-28 16:21:58
