import pandas as pd

df = pd.DataFrame([["a", "d"], ["", ""], ["", "3"]],
                  columns=["a", "b"])
df
   a  b
0  a  d
1
2     3
I'm looking to do a vectorized string concatenation with an if statement like this:
df["c"] = df["a"] + "()" + df["b"] if df["a"].item != "" else ""
But it doesn't work: .item (without a call) is just a bound method, and a Python if/else evaluates a single condition for the whole expression rather than one per row. Is it possible to do this without an apply or lambda that goes through each row? A vectorized operation would let pandas concatenate whole columns at once, which should be faster.
Desired output:
df
   a  b     c
0  a  d  a()d
1
2     3
Try this: using np.where()
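import numpy as np
import pandas as pd

df = pd.DataFrame([["a", "d"], ["", ""], ["", "3"]],
                  columns=["a", "b"])
# Where "a" is non-empty, build "a()b"; otherwise leave "c" empty.
df['c'] = np.where(df['a'] != '', df['a'] + '()' + df['b'], '')
print(df)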
output:
   a  b     c
0  a  d  a()d
1
2     3
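For context on why the plain if/else from the question cannot be applied per row while np.where can, here is a minimal sketch. It assumes the condition was meant to be df["a"] != "" (the names error_message and mask are only for illustration).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([["a", "d"], ["", ""], ["", "3"]], columns=["a", "b"])

# A Python conditional expression needs ONE truth value, but comparing a
# Series yields one boolean per row, so pandas raises instead of guessing.
try:
    broken = df["a"] + "()" + df["b"] if df["a"] != "" else ""
except ValueError as exc:
    error_message = str(exc)
    print("if/else failed:", error_message)

# np.where receives the whole boolean mask and chooses row by row.
mask = df["a"] != ""
df["c"] = np.where(mask, df["a"] + "()" + df["b"], "")
print(df["c"].tolist())
```

The point is that the selection has to happen inside a function that understands arrays; Python's own if/else never sees individual rows.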
IIUC, you could use mask to concatenate both columns, separated by some string using str.cat, whenever a condition holds:
df['c'] = df.a.mask(df.a.ne(''), df.a.str.cat(df.b, sep='()'))
print(df)
   a  b     c
0  a  d  a()d
1
2     3
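If the direction of mask feels backwards, the same result can be written with Series.where, which keeps values where the condition is True instead of replacing them. A small sketch on the same example data (joined, has_a, via_mask and via_where are illustrative names):

```python
import pandas as pd

df = pd.DataFrame([["a", "d"], ["", ""], ["", "3"]], columns=["a", "b"])

# Concatenate every row first; rows with an empty "a" are fixed up below.
joined = df["a"].str.cat(df["b"], sep="()")
has_a = df["a"].ne("")

# mask: replace entries of df["a"] where the condition is True.
via_mask = df["a"].mask(has_a, joined)
# where: keep entries of `joined` where the condition is True, else "".
via_where = joined.where(has_a, "")

print(via_mask.tolist())
print(via_where.tolist())
```

Both variants compute the concatenation for all rows and then select, so which one to use is mostly a matter of readability.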
Since nobody has mentioned it yet, you can also use the apply method:
df['c'] = df.apply(lambda r: r['a']+'()'+r['b'] if r['a']!='' else '', axis=1)
If anyone checks the performance, please comment below :)
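For anyone who wants to check, here is a rough timing sketch using timeit. The test frame (the three example rows repeated 2,000 times) and the wrapper function names are made up for illustration, and the numbers will depend on your machine and pandas version, so treat it as a starting point rather than a benchmark.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: the three example rows repeated 2,000 times.
base = pd.DataFrame([["a", "d"], ["", ""], ["", "3"]], columns=["a", "b"])
df = pd.concat([base] * 2000, ignore_index=True)

def with_np_where(frame):
    values = np.where(frame["a"] != "", frame["a"] + "()" + frame["b"], "")
    return pd.Series(values, index=frame.index)

def with_mask(frame):
    return frame["a"].mask(frame["a"].ne(""),
                           frame["a"].str.cat(frame["b"], sep="()"))

def with_apply(frame):
    return frame.apply(
        lambda r: r["a"] + "()" + r["b"] if r["a"] != "" else "", axis=1)

candidates = (with_np_where, with_mask, with_apply)

# Sanity check first: all three approaches should agree.
results = {func.__name__: func(df) for func in candidates}

for func in candidates:
    seconds = timeit.timeit(lambda: func(df), number=3) / 3
    print(f"{func.__name__}: {seconds:.4f} s per call")
```

Since apply with axis=1 calls a Python function once per row, it usually falls behind the other two as the frame grows, but it is worth measuring on your own data.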