Using regular expressions to match words from a list

Question

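I'm new to regular expressions and am trying to find all phrases whose words start with the successive letters of each entry in a list.

For example, I have the list: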

[' MRI', 'fMRI ', 'PPE', 'FFE']

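I'm trying to use the letters of each entry to find matching words in a text, and to ignore an entry if nothing matches.

So, for the list above, I want to check whether the text contains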

Magnetic resonance imaging
functional Magnetic resonance imaging
personal protection equipment
None

I've found several ways to do this, but none that work when the words come from a list.

Any help here would be greatly appreciated.

Answer 1

Use the re library. If the match should be case-insensitive, pass the flags=re.I option.

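import re

# Acronyms to look up; stray whitespace is stripped before use.
acronyms = ['  MRI', 'fMRI', 'PPE', 'FFE']
text = """pull porous experiment
 public protection expertise
personal protective 
equipment
here is a magnetic resonance interglobular section
with a certain energy measure is on a table"""

matched = {}
for a in acronyms:
    acronym = a.strip()
    # One sub-pattern per letter: optional spaces, the letter,
    # the rest of the word, then the whitespace that follows it.
    pattern = ''
    for letter in acronym:
        pattern += '[ ]*{}[^ \n]+[ \n]+'.format(letter)
    print(acronym, pattern)
    # re.I makes the search case-insensitive.
    matched[acronym] = re.findall(pattern, text, flags=re.I)

print(matched)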

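matched should now hold a dictionary that maps each acronym to the list of its matches.

The output for matched is now (note that leading and trailing whitespace has been stripped from the acronyms)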

{'MRI': [' magnetic resonance interglobular '], 'fMRI': [], 'PPE': ['pull porous experiment\n ', 'public protection expertise\n', 'personal protective \nequipment\n'], 'FFE': []}

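This lets a result span multiple lines, but the end-of-line characters ( \n ) are then included in the matched strings. If you would rather have spaces there, you can use, for example, re.sub to replace [\n ]+ with a single space.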
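The re.sub cleanup mentioned above could look roughly like the sketch below. The helper name normalize_whitespace is hypothetical, the sample strings are copied from the output shown earlier, and the final .strip() is an extra step that the answer does not mention.

```python
import re

# Sample matches copied from the output shown above.
raw_matches = [
    'pull porous experiment\n ',
    'public protection expertise\n',
    'personal protective \nequipment\n',
]

def normalize_whitespace(match):
    """Collapse each run of spaces/newlines into one space, then trim the ends.

    Hypothetical helper; the trailing .strip() is an addition for tidiness.
    """
    return re.sub(r'[\n ]+', ' ', match).strip()

cleaned = [normalize_whitespace(m) for m in raw_matches]
print(cleaned)
# ['pull porous experiment', 'public protection expertise', 'personal protective equipment']
```

Applying the same helper to every list in matched would give phrases that read as single-line strings.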

Here is the reference for the re library: https://docs.python.org/3/library/re.html . And here is one of many general introductions to regular expressions that may be useful: https://docs.python.org/3/howto/regex.html#regex-howto .
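As a possible variation on the approach above (not the answerer's code), the pattern can be built from the \b and \s tokens described in those references. This means a match cannot start in the middle of a word, and the last word does not need trailing whitespace. The function name find_expansions and the sample sentence are made up for illustration. Note that \w* also accepts one-letter words and stops at hyphens, which differs from the original pattern.

```python
import re

def find_expansions(acronym, text):
    """Return phrases whose words start with the acronym's letters, in order.

    Hypothetical helper: a sketch that assumes words consist of \\w characters.
    """
    letters = acronym.strip()
    # One word per letter: the (escaped) letter followed by the rest of the word.
    words = [re.escape(letter) + r'\w*' for letter in letters]
    # \b anchors the phrase on word boundaries; \s+ allows spaces or newlines
    # between consecutive words.
    pattern = r'\b' + r'\s+'.join(words) + r'\b'
    return re.findall(pattern, text, flags=re.I)

# Made-up sample sentence, not taken from the question.
sample = "An fMRI scan is functional magnetic resonance imaging"
print(find_expansions(' MRI', sample))   # ['magnetic resonance imaging']
print(find_expansions('fMRI ', sample))  # ['functional magnetic resonance imaging']
print(find_expansions('FFE', sample))    # []
```

Because re.findall returns non-overlapping matches, each stretch of text is reported at most once per acronym.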

Answer 1, dated 2020-04-20 20:51:37
