Pandas: How to remove words in a string which appear before a certain word from another column
I have a large CSV file with a column containing strings. At the beginning of these strings there is a set of ID numbers, which also appear in another column, as shown below.
0 Home /buy /York /Warehouse /P000166770Ou... P000166770
1 Home /buy /York /Plot /P000165923A plot of la... P000165923
2 Home /buy /London /Commercial /P000165504A str... P000165504
...
804 Brand new apartment on the first floor, situat... P000185616
I want to remove all text that appears before the ID number, along with the ID itself, so here we would get:
0 Ou...
1 A plot of la...
2 A str...
...
804 Brand new apartment on the first floor, situat...
I tried something like
df['column_one'].str.split(df['column_two'])
and
df['column_one'].str.replace(df['column_two'],'')
You could replace the pattern using a regex as follows:
>>> import pandas as pd
>>> # Raw string so that \s reaches the regex engine unchanged
>>> my_pattern = r"^(Alpha|Beta|QA|Prod)\s[A-Z0-9]{7}"
>>> my_series = pd.Series(['Alpha P17089OText starts here'])
>>> my_series.str.replace(my_pattern, '', regex=True)
0 Text starts here
There is a bit of work to be done to determine the nature of your pattern. I would suggest experimenting a bit with https://regex101.com/
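As a rough sketch of what such a pattern might look like for the data in the question: the version below assumes every ID is the letter "P" followed by exactly nine digits, which is only a guess based on the sample rows. The sample strings are made up for illustration.

```python
import pandas as pd

# Assumption: every ID is "P" followed by exactly nine digits.
# The lazy ".*?" stops at the first such ID; "\s*" drops any space after it.
id_pattern = r"^.*?P\d{9}\s*"

# Hypothetical sample data, not taken from the original CSV
descriptions = pd.Series([
    "Home /buy /York /Plot /P000165923Some description",
    "Brand new apartment on the first floor",
])

# Rows without an ID do not match and are left unchanged
cleaned = descriptions.str.replace(id_pattern, "", regex=True)
print(cleaned.tolist())
```

This ignores the second column entirely, so it only helps if the ID format really is that regular.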
To extend your split() idea:
df.apply(lambda x: x['column_one'].split(x['column_two'])[1], axis=1)
0 Text starts here
I managed to get it to work using:
df.apply(lambda x: x['column1'].split(x['column2'])[1] if x['column2'] in x['column1'] else x['column1'], axis=1)
This also works when the ID is not in the description. Thanks for the help!
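For what it's worth, the same idea can probably be written with str.partition, which avoids the separate "in" check. This is only a sketch: the helper name strip_through_id and the sample frame are hypothetical, and the column names follow the snippet above.

```python
import pandas as pd

def strip_through_id(text: str, id_code: str) -> str:
    """Return the part of text after the first id_code, or text unchanged."""
    _, sep, tail = text.partition(id_code)
    # sep is empty when id_code was not found in text
    return tail if sep else text

# Hypothetical sample data for illustration
df = pd.DataFrame({
    "column1": ["Home /buy /York /P000000001First description",
                "Second description without an ID"],
    "column2": ["P000000001", "P000000002"],
})

# A plain list comprehension over both columns avoids row-wise apply
df["cleaned"] = [strip_through_id(t, c)
                 for t, c in zip(df["column1"], df["column2"])]
print(df["cleaned"].tolist())
```

One difference from split(...)[1]: if the ID occurs more than once, partition keeps everything after the first occurrence, whereas split(...)[1] keeps only the text between the first and second occurrence.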
Here is one way to do it, by applying a regex to each row based on its code:
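import re

def ext(row):
    # Escape the code so it is matched literally, then capture what follows it
    mch = re.findall(r"{0}(.*)".format(re.escape(row['code'])), row['txt'])
    if len(mch) > 0:
        rtn = mch.pop()
    else:
        rtn = row['txt']  # code not found: keep the text unchanged
    return rtn

df['ext'] = df.apply(ext, axis=1)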
df
x txt code ext
0 0 Home /buy /York /Warehouse / P000166770 Ou... P000166770 Ou...
1 1 Home /buy /York /Plot /P000165923A plot of la... P000165923 A plot of la...
2 2 Home /buy /London /Commercial /P000165504A str... P000165504 A str...
3 804 Brand new apartment on the first floor situat... P000185616 Brand new apartment on the first floor situat...
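One caveat when building a regex from a column value: if a code ever contained a regex metacharacter, an unescaped pattern would silently fail to match. The IDs in the question are plain alphanumerics, so this may never come up; the sketch below uses a hypothetical code "A+1" and a hypothetical helper text_after_code just to show the effect of re.escape.

```python
import re

def text_after_code(txt: str, code: str) -> str:
    """Return what follows the first literal occurrence of code, else txt."""
    # re.escape makes characters such as "+" or "." match themselves
    match = re.search(re.escape(code) + r"(.*)", txt, flags=re.DOTALL)
    return match.group(1) if match else txt

# Hypothetical text and code containing the metacharacter "+"
sample = "Home /buy /A+1Some text"

print(text_after_code(sample, "A+1"))

# Without escaping, "A+1" means "one or more A, then 1" and finds nothing
print(re.findall(r"A+1(.*)", sample))
```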