How to split column element and replace by other column value in python dataframe?

Question

I want to replace part of a col1 element with a value from col2. For example, if col1 contains abc, I want to replace it with a{col2}c.

import numpy as np
import pandas as pd
d = {'col1': ['a b', 'a c'], 'col2': ['z 26', 'y 25']}
df = pd.DataFrame(data=d)
print(df)
    col1     col2
    a b      z 26
    a c      y 25

Required output, where the replacement applies only to rows with df['col1'] == 'a b':

    col1    col2     col3
0   a b     z 26     a z
1   a c     y 25     a c

I tried

df['col3'] = np.where(df[df['col1']=='a b'],(df['col1'].replace(str(df['col1'].str.split(' ')[1])),(str(df['col2'].str.split(' ')[0]))), 0)

Error: operands could not be broadcast together with shapes (1,3) (2,) ()

and

for x in df['col1']:
  x.replace(df['col1'].str.split(' ')[1],df['col2'].str.split(' ')[1])
# Error: replace() argument 1 must be str, not list

Can anyone suggest an easy solution?

Answer 1

I'm not entirely certain I've understood what you're trying to do here, but this does what you ask for:

for i, row in df.iterrows():
    if row["col1"] == "a b":
        row["col1"] = "a " + row["col2"].split(" ")[0]

To iterate over a dataframe row-wise, use iterrows, which yields (index, row) tuples.

EDIT: Note that row may be a copy, so modifying it is not guaranteed to change the original dataframe. If you need the change in the original df, write the value back inside the loop, after modifying the row:

        df.loc[i, "col1"] = row["col1"]

This is quite un-pandasy, and there is doubtless a way of doing this in one step with pandas, but this pattern will work with anything. It is typically slower than a 'vectorised' solution, because iterrows builds a Series for every row, though that rarely matters for a small dataframe.

Note that the condition ('a b') is hard-coded here, which seems to be what you want.
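
For what it's worth, here is a minimal sketch of the same loop that writes the result into a new col3 (as in the question's required output) instead of overwriting col1. The name target is introduced only for illustration, and the sketch assumes every value contains space-separated tokens. It writes through df.at rather than through row, since row may be a copy.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a b', 'a c'], 'col2': ['z 26', 'y 25']})

target = 'a b'  # hypothetical name for the value to match

# Start col3 as a copy of col1 so unmatched rows keep their value.
df['col3'] = df['col1']

for i, row in df.iterrows():
    if row['col1'] == target:
        first = row['col1'].split(' ')[0]
        replacement = row['col2'].split(' ')[0]
        # Write to the dataframe itself, not to the (possibly copied) row.
        df.at[i, 'col3'] = f'{first} {replacement}'

print(df)
```

Only col3 is written inside the loop, so col1 and col2 are left as they were.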

Answer 2

import numpy as np
import pandas as pd
d = {'col1': ['a b', 'a c'], 'col2': ['z 26', 'y 25']}
df = pd.DataFrame(data=d)

Solution 1:

# .str[0] takes the first character of each string
df.loc[df['col1'] == 'a b', 'col3'] = df['col1'].str[0] + ' ' + df['col2'].str[0]
# rows that did not match keep their col1 value
df['col3'] = df['col3'].fillna(df['col1'])

Solution 2:

condition = df['col1'] == 'a b'
df['col3'] = np.where(condition, df['col1'].str[0] + ' ' + df['col2'].str[0], df['col1'])
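
Both solutions use .str[0], which picks the first character, so they only fit single-character tokens like the ones in the question. If the tokens can be longer, a sketch along these lines may help. It splits on the space and takes the first token instead. The function name add_col3 and the second dataframe are made up for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd


def add_col3(df, target):
    """Return a copy of df with col3 built from the first tokens of col1 and col2.

    Hypothetical helper: rows whose col1 equals target get
    '<first token of col1> <first token of col2>'; other rows keep col1.
    """
    first_of_col1 = df['col1'].str.split(' ').str[0]
    first_of_col2 = df['col2'].str.split(' ').str[0]
    out = df.copy()
    out['col3'] = np.where(out['col1'] == target,
                           first_of_col1 + ' ' + first_of_col2,
                           out['col1'])
    return out


# Data from the question
result = add_col3(pd.DataFrame({'col1': ['a b', 'a c'],
                                'col2': ['z 26', 'y 25']}), 'a b')

# Invented data with multi-character tokens
wider = add_col3(pd.DataFrame({'col1': ['ab cd', 'a c'],
                               'col2': ['zz 26', 'y 25']}), 'ab cd')

print(result)
print(wider)
```

On the question's data this should match Solution 2. The two only differ once a token has more than one character.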

Answer 3

While this is quite messy, I managed to do it in one line. I am honestly not sure if it does exactly what you want, but I can give you a hand modifying it if necessary.

df['col3'] = [f'{row.col1.split()[0]} {row.col2.split()[0]}' if row.col1 == 'a b' else row.col1 for _, row in df.iterrows()]

Answer 4

I found a way like this: split both columns and join the first parts. However, I was looking for a replace-and-insert approach.

# Note: this applies to every row, not only the rows where col1 == 'a b'
df['col3'] = df['col1'].apply(lambda x: x.split(' ')[0]) + ' ' + df['col2'].apply(lambda x: x.split(' ')[0])
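
One possible way to get closer to a replacement is to swap the second token of col1 for the first token of col2, and only in the rows that match. This is a rough sketch rather than a definitive answer. The helper swap_second_token and its target parameter are hypothetical names, and it assumes the tokens are separated by single spaces.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a b', 'a c'], 'col2': ['z 26', 'y 25']})


def swap_second_token(row, target='a b'):
    """Hypothetical helper: replace the second token of col1 with the
    first token of col2, but only when col1 equals target."""
    tokens = row['col1'].split(' ')
    if row['col1'] == target and len(tokens) > 1:
        tokens[1] = row['col2'].split(' ')[0]
    return ' '.join(tokens)


# axis=1 passes each row to the helper as a Series
df['col3'] = df.apply(swap_second_token, axis=1)

print(df)
```

Any tokens after the second one are kept, so a value such as 'a b c' would keep its trailing 'c'.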

4 answers

solution1
3 ACCEPTED 2021-10-07 14:55:08

solution2
1 2021-10-08 11:30:21

solution3
0 2021-10-07 14:59:02

solution4
0 2021-10-07 15:22:13
