Strip function using Python regex

Question

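I am a beginner in Python and have just learned about regular expressions.
What I am trying to do is build a strip feature (like strip()) using regex.
Below is the code I wrote: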

import regex

stripRegex = regex.compile(r"(\s*)((\S*\s*\S)*)(\s*)")
text = '              Hello World This is me Speaking                                    '
check = stripRegex.search(text)
print(check)
print('group 1 :', stripRegex.search(text).group(1))
print('group 2 :', stripRegex.search(text).group(2))
print('group 3 :', stripRegex.search(text).group(3))
print('group 4 :', stripRegex.search(text).group(4))

The result is

group 1 :
group 2 : Hello World This is me Speaking
group 3 : peaking
group 4 :

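Here, I am wondering about two things.
1) How does group 3 return 'peaking'?
2) Does Python recognize '(' in order and assign the group numbers to whichever comes first?
So in this code, (\s*)((\S*\s*\S)*)(\s*):
the first (\s*) is the first group,
((\S*\s*\S)*) is the second,
(\S*\s*\S) is the third,
the second (\s*) is the fourth.

Am I right?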

Answer 1

You are right. \S*\s*\S matches:

\S* - at least 0 non-whitespace
\s* - at least 0 whitespace
\S  - one non-whitespace

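Group 3, (\S*\s*\S), is repeated to feed group 2, ((\S*\s*\S)*). Group 3 therefore holds only the last match it fed into group 2: zero or more non-whitespace characters, followed by zero or more whitespace characters, followed by one non-whitespace character. In the sample traced here that last match is 'tring' (for the text in the question it is 'peaking'). The first match shows how this works: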

'Hello T'
\S* matches 'Hello'
\s* matches ' '
\S  matches 'T'

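Repeating this, each match ends on the first letter of the next word, so the following match starts at that word's second letter: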

'his i'
\S* matches 'his'
\s* matches ' '
\S  matches 'i'

And so on, until...

The final match then lacks the first letter of the last word, needs no whitespace, and must end with a non-whitespace character:

'tring'
\S* matches 'trin'
\s* matches ''      (at least 0 whitespace, so zero)
\S  matches 'g'
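The behaviour described above, where a repeated capture group keeps only its last iteration, can be checked directly. This is a minimal sketch that assumes the standard-library re module in place of the third-party regex module from the question, and a shortened version of the question's text:

```python
import re  # assumption: stdlib re; the question imports the third-party regex module

pattern = re.compile(r"(\s*)((\S*\s*\S)*)(\s*)")
text = "   Hello World This is me Speaking   "

match = pattern.search(text)

# Group 2 spans every iteration of the repeated inner group ...
print(repr(match.group(2)))  # 'Hello World This is me Speaking'

# ... while group 3 only remembers the text of its last iteration.
print(repr(match.group(3)))  # 'peaking'
```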

Answer 2

Q2: You are right. Going left to right, the first ( starts group 1, the second ( starts group 2, and so on.
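A small sketch of that numbering rule, using a hypothetical nested pattern that is not from the question and the standard-library re module:

```python
import re

# Hypothetical pattern chosen only to show nesting; groups are numbered
# by the position of their opening parenthesis, left to right.
nested = re.compile(r"(a(b)(c(d)))")
m = nested.fullmatch("abcd")

print(nested.groups)  # 4
print(m.groups())     # ('abcd', 'b', 'cd', 'd')
```

Group 1 is the outermost pair because its ( comes first, even though its ) comes last.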

Q1: Group 3 matches repeatedly because it is followed by *. Its final value is the value of its last match. The matches for group 3 are:

"Hello W" where \S*="Hello"   \s*=" "   \S="W"
"orld T"  where \S*="orld"    \s*=" "   \S="T" 
"his i"   where \S*="his"     \s*=" "   \S="i"
"s m"     where \S*="s"       \s*=" "   \S="m"
"e S"     where \S*="e"       \s*=" "   \S="S"
"peaking" where \S*="peakin"  \s*=""    \S="g"

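Python's re module does not expose the intermediate iterations of a repeated group, so one way to reproduce the list above is to run the inner pattern over the text captured by group 2. This is only a sketch: it assumes the standard-library re module and a shortened version of the question's text, and it works here because each iteration starts exactly where the previous one ended.

```python
import re

text = "   Hello World This is me Speaking   "
outer = re.compile(r"(\s*)((\S*\s*\S)*)(\s*)")
inner = re.compile(r"\S*\s*\S")  # the body of group 3 on its own

match = outer.search(text)
body = match.group(2)

# Replay the iterations of group 3 one by one.
iterations = [m.group() for m in inner.finditer(body)]
print(iterations)
# ['Hello W', 'orld T', 'his i', 's m', 'e S', 'peaking']

# The last replayed iteration is what group 3 ends up holding.
print(iterations[-1] == match.group(3))  # True
```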
A great tool for understanding your regex is https://regex101.com/r/MmYOPT/1 (although it does not help much with this kind of repeated match).
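For the original goal of a regex-based strip(), one possible sketch keeps the asker's pattern but makes the inner group non-capturing with (?:...), so the stripped text is simply group 1. The function name regex_strip is hypothetical, and the standard-library re module is assumed:

```python
import re

def regex_strip(s: str) -> str:
    """Return s without leading and trailing whitespace."""
    # (?:...) repeats without capturing, so no extra group holds a
    # leftover last iteration; group 1 is the whole stripped text.
    m = re.fullmatch(r"\s*((?:\S*\s*\S)*)\s*", s)
    return m.group(1)

text = "   Hello World This is me Speaking   "
print(repr(regex_strip(text)))  # 'Hello World This is me Speaking'
```

For ordinary strings this should agree with str.strip() called without arguments.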
