
Extract specific words from string

I have a data frame like this:

Column_A
1. A lot of text inhere, but I want all words that have a comma in the middle. Like this: hello,world. A string can contain multiple relevant words, like hello,python and we have also many                         whit                spaces              in          the text   
2. What I want is to abstract,all words with that pattern. Not sure if it has an impact, but some parts of the strings containing "this signs". or "this,signs"                                     thanks  for helpingme                    greets! 

Expected result:

hello,world
hello,python
abstract,all
"this,signs"

I tried to do this with the following code:

df['B'] = df['Column_A'].str.findall(r',').str.join(' ').str.strip()

However, this does not give me the desired result.

Given the specific format of the expected output, it seems you could use:

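import pandas as pd
from itertools import chain

# findall returns a list of "word,word" matches per row; chain flattens those lists
l = chain.from_iterable(df.Column_A.str.findall(r'\w+,\w+').values.tolist())
pd.DataFrame(list(l), columns=['Column_A'])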

      Column_A
0   hello,world
1  hello,python
2  abstract,all
3    this,signs
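
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The sample strings are shortened stand-ins for the rows in the question (an assumption, not the original data), and it swaps itertools.chain for Series.explode to flatten the per-row match lists, which is an alternative rather than the method shown above.

```python
import pandas as pd

# Shortened stand-ins for the two rows in the question (illustrative only).
df = pd.DataFrame({
    "Column_A": [
        "Like this: hello,world. A string can contain hello,python and   many   spaces",
        'What I want is to abstract,all words. Some contain "this signs". or "this,signs"',
    ]
})

# One or more word characters, a comma, then one or more word characters.
pattern = r"\w+,\w+"

# findall yields a list of matches per row; explode puts each match on its own row.
matches = df["Column_A"].str.findall(pattern)
result = (
    matches.explode()          # one match per row (NaN for rows with no match)
    .dropna()                  # drop rows that had no match at all
    .reset_index(drop=True)    # renumber 0..n-1
    .to_frame("Column_A")
)

print(result)
```

Note that the original attempt with r',' matches only the comma character itself, which is why it returned nothing useful. Also, \w does not match quotation marks, so "this,signs" comes back without its surrounding quotes; if the quotes need to be kept, the pattern would have to allow for them explicitly.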
