Golang: Why does regexp.FindAllStringSubmatch() return [][]string and not []string?
I am fairly new to Go, and this is the first time I have had to deal with regexp.
I am a bit surprised that someregex.FindAllStringSubmatch("somestring", -1) returns a slice of slices, [][]string, instead of a simple slice of strings, []string.
Example:
someRegex, _ := regexp.Compile("^.*(mes).*$")
matches := someRegex.FindAllStringSubmatch("somestring", -1)
fmt.Println(matches) // logs [[somestring mes]]
What is the reason for this behavior? I can't figure it out.
The func (*Regexp) FindAllStringSubmatch method extracts matches and captured submatches.
A submatch is the part of the text matched by the part of the regex that is enclosed in a pair of unescaped parentheses (a so-called capturing group).
In your case, ^.*(mes).*$ matches:
^ - start of string
.* - any 0+ chars, as many as possible
(mes) - Capturing group 1: a mes substring
.*$ - the rest of the string.
So, the match value is the whole string. It will be the first value in the output. Then, since there is a capturing group, there must be a place for it in the results, hence mes is placed as the second item in the list.
Since there may be more than one match, we need a list of lists: the outer slice has one element per match, and each inner slice holds that match followed by its submatches.
A better example may be one that extracts several matches and submatches (and includes an optional group, too):
package main
import (
"fmt"
"regexp"
)
func main() {
someRegex, _ := regexp.Compile(`[^aouiye]([aouiye])([^aouiye])?`)
matches := someRegex.FindAllStringSubmatch("somestri", -1)
fmt.Printf("%q\n", matches)
}
The [^aouiye]([aouiye])([^aouiye])? pattern matches a non-vowel, a vowel, and an optional non-vowel, capturing the last two into separate groups #1 and #2.
The results are [["som" "o" "m"] ["ri" "i" ""]]. There are 2 matches, and each contains a match value, a Group 1 value and a Group 2 value. Since the ri match has no text captured into Group 2 ( ([^aouiye])? ), it is empty, but it is still there since the group is defined in the regex pattern.
From the package documentation: FindAllStringSubmatch is the 'All' version of FindStringSubmatch; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.
To sum up: you get a slice of slices of strings because this is the 'All' version of FindStringSubmatch. FindStringSubmatch itself returns a single slice of strings.