Golang: Why does regexp.FindAllStringSubmatch() return [][]string instead of []string?

Question

I am fairly new to Go, and this is the first time I have had to deal with regexp.

I am a bit surprised that someRegex.FindAllStringSubmatch("somestring", -1) returns a slice of slices, [][]string, instead of a simple slice of strings, []string.

Example:

someRegex, _ := regexp.Compile("^.*(mes).*$")
matches := someRegex.FindAllStringSubmatch("somestring", -1)
fmt.Println(matches) // logs [[somestring mes]]

What is the reason for this behavior? I can't figure it out.

Answer 1

The func (*Regexp) FindAllStringSubmatch extracts matches and captured submatches.

A submatch is the part of the text matched by the part of the regex that is enclosed in a pair of unescaped parentheses (a so-called capturing group).

In your case, ^.*(mes).*$ matches:

^ - start of string
.* - any 0+ chars other than newline, as many as possible
(mes) - Capturing group 1: a mes substring
.*$ - the rest of the string.

So, the match value is the whole string. It will be the first value in the output. Then, since there is a capturing group, there must be a place for it in the results; hence, mes is placed as the second item in the list.

Each match produces its own list (the match value followed by its group values), and since there may be more than one match, we need a list of lists.

A better example may be one that extracts several matches and submatches (and maybe includes an optional group, too):

package main

import (
    "fmt"
    "regexp"
)

func main() {
    // MustCompile panics on an invalid pattern instead of silently discarding the error.
    someRegex := regexp.MustCompile(`[^aouiye]([aouiye])([^aouiye])?`)
    matches := someRegex.FindAllStringSubmatch("somestri", -1)
    fmt.Printf("%q\n", matches)
}

The [^aouiye]([aouiye])([^aouiye])? pattern matches a non-vowel, a vowel, and an optional non-vowel, capturing the last two into separate groups #1 and #2.

The results are [["som" "o" "m"] ["ri" "i" ""]]. There are 2 matches, and each contains a match value, a Group 1 value and a Group 2 value. Since the ri match has no text captured into Group 2 (([^aouiye])?), that value is empty, but it is still there since the group is defined in the regex pattern.

Answer 2

FindAllStringSubmatch is the 'All' version of FindStringSubmatch; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

Docs.

To sum up: you need a slice of slices of strings because this is the 'All' version of FindStringSubmatch. FindStringSubmatch returns a single slice of strings.

2 answers

Answer 1: 4, accepted, 2017-08-24 08:36:10

Answer 2: 2, 2017-08-24 08:32:52
