I'm trying to create categories based on multiple columns in pandas, but it is taking forever to run, so I'm not sure it is correct. I left it for 30 minutes and it was still running, so I stopped it. I'm trying to create a new column based on several other columns (in my actual data it is about 15 columns). When I try it on a smaller dataset, however, it is very quick. Any suggestions?
other_cols = ['col1', 'col2', 'col3', 'col4', 'col5']

def labels(row):
    if ((row['col 6'] > 1) & (row[other_cols] < 1)).all():
        return 'Yes'
    if ((row['col 6'] > 1) & (row['col 7'] > 1) & (row[other_cols] < 1)).all():
        return 'Maybe'
    if ((row['col 6'] < 1) & (row['col 7'] > 1) & (row[other_cols] < 1)).all():
        return 'no'

df['category'] = df.apply(lambda row: labels(row), axis=1)
Maybe you can try this:
other_cols = ['col1', 'col2', 'col3', 'col4', 'col5']

def labels(row):
    # Every column in other_cols must be below 1 for any label to apply.
    if ((row['col 6'] > 1) & (row[other_cols] < 1)).all():
        row['category'] = 'Yes'
    elif ((row['col 6'] > 1) & (row['col 7'] > 1) & (row[other_cols] < 1)).all():
        row['category'] = 'Maybe'
    elif ((row['col 6'] < 1) & (row['col 7'] > 1) & (row[other_cols] < 1)).all():
        row['category'] = 'no'
    else:
        row['category'] = ''
    return row  # apply(axis=1) needs the modified row returned

df = df.apply(labels, axis=1)
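Since df.apply with axis=1 calls a Python function once per row, it tends to get slow on large frames. One thing that might help is to build the conditions on whole columns and let numpy.select assign the labels. This is only a sketch: the sample DataFrame is made up for illustration, and the conditions are copied from the question in the same order.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'col1': [0.1, 0.2, 0.3, 2.0],
    'col2': [0.1, 0.2, 0.3, 0.1],
    'col3': [0.5, 0.5, 0.5, 0.5],
    'col4': [0.0, 0.0, 0.0, 0.0],
    'col5': [0.9, 0.9, 0.9, 0.9],
    'col 6': [2.0, 2.0, 0.5, 2.0],
    'col 7': [0.5, 3.0, 3.0, 3.0],
})

other_cols = ['col1', 'col2', 'col3', 'col4', 'col5']

# One boolean per row: True when every column in other_cols is below 1.
others_low = (df[other_cols] < 1).all(axis=1)
col6_high = df['col 6'] > 1
col6_low = df['col 6'] < 1
col7_high = df['col 7'] > 1

# np.select uses the first condition that is True for each row,
# so the list order mirrors the order of the if statements.
conditions = [
    col6_high & others_low,              # 'Yes'
    col6_high & col7_high & others_low,  # 'Maybe'
    col6_low & col7_high & others_low,   # 'no'
]
choices = ['Yes', 'Maybe', 'no']

# Rows matching no condition get an empty string.
df['category'] = np.select(conditions, choices, default='')
print(df['category'].tolist())
```

Note that with the conditions in this order, every row that satisfies the 'Maybe' test also satisfies the 'Yes' test, so 'Maybe' is never assigned (the same is true of the row-wise versions). If 'Maybe' is meant to win when 'col 7' is also above 1, moving that condition to the front of the list should do it.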
What is the size of your dataset?
I'm sorry, I cannot post this as a comment; I am still new here.
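In case it helps with answering that, here is a small sketch of one way to report the size. The tiny DataFrame is just a hypothetical stand-in for the real data.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset.
df = pd.DataFrame({'col 6': [2.0, 0.5], 'col 7': [0.5, 3.0]})

n_rows, n_cols = df.shape
# deep=True also counts the contents of object (string) columns.
mem_bytes = int(df.memory_usage(deep=True).sum())
print(f"{n_rows} rows x {n_cols} columns, about {mem_bytes} bytes")
```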