Regex: find all words that contain certain letters but not others

Question

Can anyone help me with this?

I need to find all words in a list that contain the letters [t OR d] AND [k OR c] but none of [s, z, n, m].

I figured out the first part, but I don't know how to include the stop list:

\w*[t|d]\w*[k|c]\w*

in Python notation.

Thank you in advance.

Answer 1

You can do this in two steps. First find the words with t|d AND k|c, then filter out the matches that contain unwanted letters.

Since you said you have figured out the first part, here is the second:

import re
# matches: the words found in the first step
matches = [i for i in matches if not re.search(r'[sznm]', i)]
print(matches)
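For reference, a minimal sketch of both steps together. The word list is a hypothetical sample, and the first step shown here (two separate searches, so letter order does not matter) is an assumption about how the first part might be done, not the asker's own pattern.

```python
import re

# Hypothetical sample list, not taken from the question.
words = ["detected", "dot", "knight", "track", "kbard", "stack"]

# Step 1: keep words containing (t or d) and (k or c), in any order.
matches = [w for w in words if re.search(r"[td]", w) and re.search(r"[kc]", w)]

# Step 2: drop words containing any of the forbidden letters.
matches = [w for w in matches if not re.search(r"[sznm]", w)]
print(matches)  # ['detected', 'track', 'kbard']
```

Here "knight" and "stack" pass the first step but are removed in the second because they contain n and s.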

Answer 2

If you need the t or d to appear before the k or c, use: [^sznm\s\d]*[td][^sznm\s\d]*[kc][^sznm\s\d]* (written with single backslashes, as in a Python raw string).

[^sznm\s\d] means any character except s, z, n, m, whitespace characters (\s), or digits (\d).
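A short sketch of how this pattern might be applied to a whole string. The sample text and the added \b word boundaries are assumptions for illustration; without the boundaries, a match could start in the middle of a word.

```python
import re

# Hypothetical sample text.
text = "detected dot knight track kbard"

# \b anchors each match to the start and end of a word (an addition
# to the pattern from the answer above).
pattern = re.compile(r"\b[^sznm\s\d]*[td][^sznm\s\d]*[kc][^sznm\s\d]*\b")

found = pattern.findall(text)
print(found)  # ['detected', 'track']
```

Note that "kbard" is not matched: it has both a d and a k, but the k comes first, and this pattern requires the t or d to come before the k or c.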

Answer 3

s = "foobar foo".split()

allowed = ({"k", "c"}, {"t", "d"})
forbid = {"s", "z", "n", "m"}

for word in s:
    if all(any(k in st for k in word) for st in allowed) and all(k not in forbid for k in word):
        print(word)

Or using a list comprehension with set.intersection:

words = [word for word in s if all(st.intersection(word) for st in allowed) and not forbid.intersection(word)]

Answer 4

Based on Padraic's answer.

EDIT: We both missed this condition:

[t OR d] AND [k OR c]

So, fixed accordingly:

s = "detected dot knight track"

allowed = ({"t","d"},{"k","c"})
forbidden = {"s","z","n", "m"}

for word in s.split():
    letter_set = set(word)
    if all(letter_set & a for a in allowed) and letter_set - forbidden == letter_set:
        print(word)

And the result is:

detected
track
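The same set-based check could be wrapped in a reusable function. This is only a sketch: the function name matching_words and its parameters are hypothetical, and set.isdisjoint is used here as an equivalent way to express "contains no forbidden letter".

```python
def matching_words(words, required_groups, forbidden):
    """Return words with at least one letter from every group and no forbidden letter."""
    result = []
    for word in words:
        letters = set(word)
        # Every required group must share at least one letter with the word.
        has_all_groups = all(letters & group for group in required_groups)
        # isdisjoint is True when the word has no forbidden letter.
        if has_all_groups and letters.isdisjoint(forbidden):
            result.append(word)
    return result


required = ({"t", "d"}, {"k", "c"})
forbidden = {"s", "z", "n", "m"}
result = matching_words("detected dot knight track".split(), required, forbidden)
print(result)  # ['detected', 'track']
```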

Answer 5

Use this code:

import re
re.findall('[abcdefghijklopqrtuvwxy]*[td][abcdefghijklopqrtuvwxy]*[kc][abcdefghijklopqrtuvwxy]*', text)
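As a sketch, the allowed-letter class can be generated rather than typed by hand. The sample text and the \b word boundaries are assumptions added for illustration; without boundaries, findall can return a fragment of a word that contains a forbidden letter.

```python
import re
import string

forbidden = "sznm"
# Every lowercase ASCII letter except the forbidden ones.
allowed = "".join(ch for ch in string.ascii_lowercase if ch not in forbidden)

# Without word boundaries, "stack" yields the fragment "tack".
unanchored = re.findall(rf"[{allowed}]*[td][{allowed}]*[kc][{allowed}]*", "stack")
print(unanchored)  # ['tack']

# With \b on both sides, only whole words can match.
pattern = re.compile(rf"\b[{allowed}]*[td][{allowed}]*[kc][{allowed}]*\b")
text = "detected dot knight track stack"  # hypothetical sample
found = pattern.findall(text)
print(found)  # ['detected', 'track']
```

Like the pattern above, this version still requires the t or d to come before the k or c.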

Answer 6

I really like the answer by @padraic-cunningham, which does not use re, but here is a pattern that will work:

pattern = r'(?!\w*[sznm])(?=\w*[td])(?=\w*[kc])\w*'

Positive (?=...) and negative (?!...) lookahead assertions are well documented on python.org.
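A sketch of the lookahead pattern applied to a whole string with re.findall. The leading \b and the sample text are assumptions added here: without the boundary, a search could begin partway through a word such as "stack" and return "tack".

```python
import re

# \b forces each match to start at the beginning of a word.
# (?!\w*[sznm]) rejects words containing a forbidden letter,
# (?=\w*[td]) and (?=\w*[kc]) require one letter from each group.
pattern = re.compile(r"\b(?!\w*[sznm])(?=\w*[td])(?=\w*[kc])\w+")

text = "detected dot knight track kbard stack"  # hypothetical sample
found = pattern.findall(text)
print(found)  # ['detected', 'track', 'kbard']
```

Because each condition is its own lookahead, the order of the letters does not matter, so "kbard" is matched here.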

Answer 7

You need to use lookarounds.

^(?=.*[td])(?!.*[sznm])\w*[kc]\w*$

i.e.,

>>> l = ['fooktz', 'foocdm', 'foobar', 'kbard']
>>> [i for i in l if re.match(r'^(?=.*[td])(?!.*[sznm])\w*[kc]\w*$', i)]
['kbard']

7 answers

Answer 1  2  2015-02-09 12:25:34
Answer 2  1  2015-02-09 12:26:55
Answer 3  1  2015-02-09 12:27:28
Answer 4  1  accepted  2015-02-09 12:36:36
Answer 5  0  2015-02-09 12:32:35
Answer 6  0  2015-02-09 12:37:49
Answer 7  0  2015-02-09 12:43:07
