Python: extracting a list of substrings

Question

How can I extract a list of substrings based on a pattern in Python?

For example:

str = 'this {{is}} a sample {{text}}'

Expected result: a Python list containing 'is' and 'text'.

Answer 1

>>> import re
>>> re.findall("{{(.*?)}}", "this {{is}} a sample {{text}}")
['is', 'text']

Answer 2

You can use the following:

import re
a = "this {{is}} a sample {{text}}"  # the input string from the question
res = re.findall("{{([^{}]*)}}", a)
print("a python list which contains %s and %s" % (res[0], res[1]))

Cheers

Answer 3

Assuming "some patterns" means "single words between double {}'s":

import re

re.findall('{{(\\w*)}}', string)

Edit: Andrew Clark's answer implements "any sequence of characters at all between double {}'s".
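To make the difference between these interpretations concrete, here is a small sketch that runs the three patterns from the answers so far against the same input. The sample string and the variable names are made up for illustration; only the patterns come from the answers.

```python
import re

# Hypothetical input: a single word, two words, and a tag containing a stray '}'.
sample = "{{one}} {{two words}} {{a}b}}"

# Answer 3: word characters only, so the space and the stray '}' prevent a match.
words_only = re.findall(r"{{(\w*)}}", sample)

# Answer 2: anything except braces, so spaces are allowed but '}' is not.
no_braces = re.findall(r"{{([^{}]*)}}", sample)

# Answer 1: any characters, as few as possible, up to the next '}}'.
lazy_any = re.findall(r"{{(.*?)}}", sample)

print(words_only)  # ['one']
print(no_braces)   # ['one', 'two words']
print(lazy_any)    # ['one', 'two words', 'a}b']
```

Which one is appropriate depends on what is allowed to appear between the braces in your data.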

Answer 4

A regex-based solution is fine for your example, although I would recommend something more robust for more complicated input.

import re

def match_substrings(s):
    return re.findall(r"{{([^}]*)}}", s)

The regex from the inside out:

[^}] matches any character that is not a '}'
([^}]*) matches any number of non-} characters and groups them
{{([^}]*)}} puts the above inside double braces
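The same breakdown can be written directly into the pattern with re.VERBOSE, which lets a regex carry whitespace and comments. This is only a sketch of an equivalent spelling; the name PATTERN is made up, and the braces are escaped here purely for readability.

```python
import re

# Same pattern as above, spelled out piece by piece.
PATTERN = re.compile(r"""
    \{\{        # opening double brace
    ([^}]*)     # group 1: any run of characters other than '}'
    \}\}        # closing double brace
""", re.VERBOSE)

matches = PATTERN.findall("this {{is}} a sample {{text}}")
print(matches)  # ['is', 'text']
```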

Without the parentheses above, re.findall would return the entire match (i.e. ['{{is}}', '{{text}}']). However, when the regex contains a group, findall returns the contents of that group instead.
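A quick way to see this behaviour is to run the pattern with a capturing group, with no group, and with a non-capturing group. The variable names are illustrative only.

```python
import re

text = "this {{is}} a sample {{text}}"

# One capturing group: findall returns what the group matched.
with_group = re.findall(r"{{([^}]*)}}", text)

# No group: findall returns each whole match, braces included.
without_group = re.findall(r"{{[^}]*}}", text)

# A non-capturing group (?:...) does not count as a group for findall.
non_capturing = re.findall(r"{{(?:[^}]*)}}", text)

print(with_group)     # ['is', 'text']
print(without_group)  # ['{{is}}', '{{text}}']
print(non_capturing)  # ['{{is}}', '{{text}}']
```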

Answer 5

You could use a regular expression to match anything that occurs between {{ and }}. Will that work for you?

Generally speaking, for tagging certain strings in a large body of text, a suffix tree will be useful.

5 answers

Answer 1: 14, accepted, 2010-12-21 17:57:18
Answer 2: 2, 2010-12-21 17:58:44
Answer 3: 2, 2010-12-21 17:59:45
Answer 4: 1, 2010-12-21 18:01:58
Answer 5: 0, 2010-12-21 17:55:16
