Regular expression (finding matching characters in order)

Question

Let us say that I have the following string variables:

welcome = "StackExchange 2016"
string_to_find = "Sx2016"

Here, I want to find the string string_to_find inside welcome using regular expressions. I want to see if each character in string_to_find appears in welcome in the same order.

For instance, this expression would evaluate to True since the 'S' comes before the 'x' in both strings, the 'x' before the '2', the '2' before the '0', and so forth.

Is there a simple way to do this using regex?

Answer 1

The answer is rather trivial. The .* character combination matches 0 or more characters (any character except a newline, by default). For your purpose, you would put it between each pair of adjacent characters, as in S.*x.*2.*0.*1.*6 . If this pattern matches, then the string satisfies your condition.

For a general string you would insert the .* pattern between characters, also taking care to escape special characters such as literal dots, stars etc. that would otherwise be interpreted by the regex engine.
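One way to handle the escaping mentioned above is Python's re.escape, applied to each character before joining with .* . The following is only a minimal sketch: the helper name build_ordered_pattern is hypothetical, and passing re.DOTALL is an added assumption so that .* can also cross newlines.

```python
import re

def build_ordered_pattern(chars):
    # Escape each character so that '.', '*' and similar are matched literally,
    # then allow any run of characters between consecutive ones.
    return '.*'.join(re.escape(c) for c in chars)

pattern = build_ordered_pattern("a.b*")
# 'a', '.', 'b', '*' appear in this order in the first text, but not in the second.
print(bool(re.search(pattern, "xa_._b_*", re.DOTALL)))  # True
print(bool(re.search(pattern, "a*b.", re.DOTALL)))      # False
```

Without the escaping step, the '.' and '*' in the search string would be read as regex operators rather than as characters to find.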

Answer 2

Use the wildcard . , repeated with * :

expression = 'S.*x.*2.*0.*1.*6'

You can also assemble this expression with join() :

expression = '.*'.join('Sx2016')

Or do it without a regular expression: check whether the positions of string_to_find's characters within welcome are in ascending order, catching the ValueError that is raised when a character of string_to_find is not present in welcome:

>>> welcome = "StackExchange 2016"
>>> string_to_find = "Sx2016"
>>> try:
...     result = [welcome.index(c) for c in string_to_find]
... except ValueError:
...     result = None
...
>>> print(result and result == sorted(result))
True
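Note that str.index returns the first occurrence of a character, so the check above can give a wrong answer when string_to_find repeats a character (for example, 'aba' inside 'aba' yields the positions [0, 1, 0]). As an alternative sketch, not part of the original answer, the hypothetical helper is_subsequence below consumes a single iterator over the text, so each character has to be found after the previous one.

```python
def is_subsequence(needle, haystack):
    # One shared iterator over haystack: each `in` test resumes where the
    # previous one stopped, which forces the matches to occur in order.
    remaining = iter(haystack)
    return all(ch in remaining for ch in needle)

print(is_subsequence("Sx2016", "StackExchange 2016"))  # True
print(is_subsequence("aba", "aba"))                    # True
print(is_subsequence("6102xS", "StackExchange 2016"))  # False
```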

Answer 3

This function might fit your needs:

import re
def check_string(text, pattern):
    return re.match('.*'.join(pattern), text)

'.*'.join(pattern) creates a pattern with all your characters separated by '.*' . For instance:

>>> ".*".join("Sx2016")
'S.*x.*2.*0.*1.*6'
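Because re.match only matches at the beginning of the text, check_string as written requires the first character of pattern to be the first character of text. If the sequence may start anywhere, re.search is one option. A small sketch of the difference, using a hypothetical variant named check_string_anywhere:

```python
import re

def check_string(text, pattern):
    # As in the answer: re.match anchors the match at the start of text.
    return re.match('.*'.join(pattern), text)

def check_string_anywhere(text, pattern):
    # Hypothetical variant: re.search looks for a match at any position.
    return re.search('.*'.join(pattern), text)

print(bool(check_string("StackExchange 2016", "Sx2016")))           # True
print(bool(check_string("StackExchange 2016", "tx2016")))           # False
print(bool(check_string_anywhere("StackExchange 2016", "tx2016")))  # True
```

Both functions return a match object or None, so the result can be used directly in a condition.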

Answer 4

Actually, for a sequence of characters like Sx2016 , the pattern that best serves your purpose is a more specific one:

S[^x]*x[^2]*2[^0]*0[^1]*1[^6]*6

You can perform this kind of check by defining a function like this:

import re
def contains_sequence(text, seq):
    pattern = seq[0] + ''.join(map(lambda c: '[^' + c + ']*' + c, list(seq[1:])))
    return re.search(pattern, text)

This approach adds a layer of complexity but brings a couple of advantages as well:

It's the fastest one, because the regex engine walks down the string only once, while the dot-star approach goes to the end of the string and backtracks each time a .* is used. Compare on the same string (~1k chars):
- Negated class -> 12 steps
- Dot star -> 4426 steps
It also works on multiline input strings, since a negated character class matches newlines while . does not by default.

Example code

>>> sequence = 'Sx2016'
>>> inputs = ['StackExchange2015','StackExchange2016','Stack\nExchange\n2015','Stach\nExchange\n2016']
>>> [x + ': yes' if contains_sequence(x, sequence) else x + ': no' for x in inputs]
['StackExchange2015: no', 'StackExchange2016: yes', 'Stack\nExchange\n2015: no', 'Stach\nExchange\n2016: yes']
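The pattern built by contains_sequence inserts each character into a negated class unescaped, so characters such as ']' or '^' in seq would change the meaning of the pattern. A possible variant that escapes every character is sketched below; the name contains_sequence_escaped and the handling of an empty sequence are assumptions, not part of the original answer.

```python
import re

def contains_sequence_escaped(text, seq):
    # Hypothetical variant of contains_sequence: every character is escaped,
    # both inside the negated class and as the literal that follows it.
    if not seq:
        return True  # assumption: an empty sequence is trivially contained
    escaped = [re.escape(c) for c in seq]
    pattern = escaped[0] + ''.join('[^' + e + ']*' + e for e in escaped[1:])
    return re.search(pattern, text) is not None

print(contains_sequence_escaped("Stach\nExchange\n2016", "Sx2016"))  # True
print(contains_sequence_escaped("a]b^c", "]^"))                      # True
print(contains_sequence_escaped("a^b]c", "]^"))                      # False
```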

4 answers

Answer 1
3 2016-07-15 08:43:27

Answer 2
1 2016-07-15 08:42:02

Answer 3
0 2016-07-15 08:48:37

Answer 4
0 2016-07-15 10:06:53
