From this Pandas data frame:
df = pd.DataFrame({'a': ['foo_abc', 'bar_def', 'ghi'], 'b': ['foo', 'bar', 'yah']})
         a    b
0  foo_abc  foo
1  bar_def  bar
2      ghi  yah
I want to remove the string in column b from the string in column a, probably with a regex, to produce:
         a    b    c
0  foo_abc  foo  abc
1  bar_def  bar  def
2      ghi  yah  ghi
How could I do this with Pandas?
Use replace with strip in a list comprehension:
df['c'] = [a.replace(b, '').strip('_') for a, b in zip(df['a'], df['b'])]
print(df)
         a    b    c
0  foo_abc  foo  abc
1  bar_def  bar  def
2      ghi  yah  ghi
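Note that a.replace(b, '') removes every occurrence of b in a, not only a leading one, and strip('_') trims underscores from both ends. If only a leading "b_" prefix should be dropped, a minimal sketch could look like the following. The helper name strip_prefix is hypothetical and not part of pandas:

```python
import pandas as pd

df = pd.DataFrame({'a': ['foo_abc', 'bar_def', 'ghi'],
                   'b': ['foo', 'bar', 'yah']})

def strip_prefix(a, b):
    # Hypothetical helper: drop a leading "<b>_" from a, if present.
    prefix = b + '_'
    return a[len(prefix):] if a.startswith(prefix) else a

# Apply the helper row by row, pairing each value of a with its b.
df['c'] = [strip_prefix(a, b) for a, b in zip(df['a'], df['b'])]
print(df)
```

Rows where a does not start with "b_" (such as 'ghi' with 'yah') are left unchanged.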
Solution with re.sub:
import re

# re.escape guards against regex metacharacters in column b
df['c'] = [re.sub('^{}_'.format(re.escape(b)), '', a) for a, b in zip(df['a'], df['b'])]
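For reference, here is a self-contained sketch of the regex approach. It wraps each value of b in re.escape so that regex metacharacters in b are matched literally. The fourth row ('a+b_xyz', 'a+b') is not from the question; it is an assumed example added only to show why escaping matters:

```python
import re
import pandas as pd

# The last row is an illustrative assumption: its b value contains '+',
# which is a regex metacharacter.
df = pd.DataFrame({'a': ['foo_abc', 'bar_def', 'ghi', 'a+b_xyz'],
                   'b': ['foo', 'bar', 'yah', 'a+b']})

# Anchor the pattern at the start of the string and escape b so it is
# treated as literal text rather than as a regex.
df['c'] = [re.sub('^{}_'.format(re.escape(b)), '', a)
           for a, b in zip(df['a'], df['b'])]
print(df)
```

Without re.escape, the pattern built from 'a+b' would mean "one or more a followed by b" and would not match the literal text 'a+b_', so that row would be left unchanged.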