I'm trying to concatenate multiple columns of a pandas DataFrame in Python. Which columns get concatenated depends on the values of some other columns. How can I do that efficiently?
I've already tried creating a key that groups the conditional fields and combining it with a for loop that checks, row by row, whether the row belongs to a specific group. Of course, this takes really long to complete.
For example, given a data frame (df):
import pandas as pd

df = pd.DataFrame({'cond_1': ['A', 'B', 'B', 'C', 'D'],
                   'cond_2': ['one', 'two', 'three', 'three', 'four'],
                   'concat_1': ['Mon', 'Tue', 'Fri', 'Wed', 'Thu'],
                   'concat_2': ['Sep', 'Oct', 'Oct', 'Nov', 'Dec'],
                   'concat_3': ['first', 'second', 'second', 'third', 'fourth']})
I have the following set of rules:
- if cond_1 = 'A' then concat_1 + concat_2
- if cond_1 = 'B' then if cond_2 = 'two' then concat_1 + concat_3 else concat_1 + concat_2
- if cond_1 in ('C', 'D') then concat_2 + concat_3
that should result in the following:
cond_1 | cond_2 | concat_1 | concat_2 | concat_3 | result
---------------------------------------------------------
A | one | Mon | Sep | first | MonSep
B | two | Tue | Oct | second | Tuesecond
B | three | Fri | Oct | second | FriOct
C | three | Wed | Nov | third | Novthird
D | four | Thu | Dec | fourth | Decfourth
Thanks for your help!
You can do this with apply, passing it a function that does the if checks and the concatenation for each row, like this:
def concate_it(row):
    # Pick the columns to join based on the condition columns.
    if row['cond_1'] == 'A':
        return row['concat_1'] + row['concat_2']
    elif row['cond_1'] == 'B' and row['cond_2'] == 'two':
        return row['concat_1'] + row['concat_3']
    elif row['cond_1'] == 'B':
        # Any other cond_2 value for 'B'.
        return row['concat_1'] + row['concat_2']
    elif row['cond_1'] in ['C', 'D']:
        return row['concat_2'] + row['concat_3']
    # Rows matching none of the rules fall through and get None.

df['result'] = df.apply(concate_it, axis=1)
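Note that apply with axis=1 still calls the Python function once per row, so it may stay slow on a large frame. Since the question asks about efficiency, here is a minimal sketch of a vectorized alternative using numpy.select. This is not part of the answer above, just one possible approach. The names conditions and choices are my own, and I'm assuming rows that match none of the rules should get None. numpy.select picks the first condition that matches, so the more specific 'B'/'two' rule has to come before the general 'B' rule.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'cond_1': ['A', 'B', 'B', 'C', 'D'],
                   'cond_2': ['one', 'two', 'three', 'three', 'four'],
                   'concat_1': ['Mon', 'Tue', 'Fri', 'Wed', 'Thu'],
                   'concat_2': ['Sep', 'Oct', 'Oct', 'Nov', 'Dec'],
                   'concat_3': ['first', 'second', 'second', 'third', 'fourth']})

# One boolean mask per rule, evaluated on whole columns at once.
# Order matters: the first matching condition wins.
conditions = [
    df['cond_1'].eq('A'),
    df['cond_1'].eq('B') & df['cond_2'].eq('two'),
    df['cond_1'].eq('B'),
    df['cond_1'].isin(['C', 'D']),
]

# The concatenation to use for each mask, in the same order.
choices = [
    df['concat_1'] + df['concat_2'],
    df['concat_1'] + df['concat_3'],
    df['concat_1'] + df['concat_2'],
    df['concat_2'] + df['concat_3'],
]

# Assumption: rows matching no rule get None.
df['result'] = np.select(conditions, choices, default=None)
print(df)
```

All four concatenations are computed for every row here, which costs some extra memory, but each one is a single column-wise operation rather than a Python call per row.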