grep regex - how to match pairs of the same character?

Question

Say I have the following string:

blah blah blah \the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\ blah blah blah \ \\ \\ \\ \ foobar \ a\\b\\c\\ \

and I want to get the following 3 matches using grep:

\the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\

and

\ \\ \\ \\ \

and

\ a\\b\\c\\ \

To do this I need a way to pair up '\\' so that the match ends only at a single closing '\' that isn't part of a pair.

So far I have this:

echo $string | grep -oP '\\((?!\\).)*\\'

Edit: I managed to get it working in the regex101 environment:

\\((?!\\).|(([\\]{2})+))+\\

https://regex101.com/r/wC2cF1/13

but it still gives me the same result with grep in Perl mode (grep -P).

Answer 1

Use

text='blah blah blah \the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\ blah blah blah \ \\ \\ \\ \ foobar \ a\\b\\c\\ \'
echo "$text" | grep -oE '\\([^\\]|\\\\)+\\'

Output:

\the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\
\ \\ \\ \\ \
\ a\\b\\c\\ \
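As a rough illustration of how the pattern reads, here is a minimal sketch on a shorter input. The function name match_pairs and the sample string are made up for this example and are not part of the answer; it assumes a grep that supports -o and -E.

```shell
# Hypothetical helper wrapping the pattern from this answer.
match_pairs() {
    # \\              the opening single backslash
    # ([^\\]|\\\\)+   one or more of: a non-backslash character, or a backslash pair
    # \\              the closing single backslash
    printf '%s\n' "$1" | grep -oE '\\([^\\]|\\\\)+\\'
}

# Made-up sample input: two delimited runs, the first containing a pair.
match_pairs 'x \a\\b\ y \c\ z'
# prints:
# \a\\b\
# \c\
```

Because a lone backslash can never be consumed inside the group, the match can only end at a backslash that is not part of a pair.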

Answer 2

If you have GNU grep then @RyszardCzech's answer is a good solution; otherwise, using any awk in any shell on every UNIX box:

$ cat tst.awk
{
    gsub(/\\\\/,RS)
    while ( match($0,/\\[^\\]*\\/) ) {
        tgt = substr($0,RSTART,RLENGTH)
        gsub(RS,"\\\\",tgt)
        print tgt
        $0 = substr($0,RSTART+RLENGTH)
    }
}


$ awk -f tst.awk file
\the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\
\ \\ \\ \\ \
\ a\\b\\c\\ \

Answer 3

Using the core Text::Balanced module to extract the string:

$ perl -MText::Balanced=extract_delimited -nE '$text = extract_delimited($_, q/\\/, qr/^[^\\]*/, q/\\/); say $text' input.txt
\the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\

Answer 4

Note: This solution is simpler and better than the answer below. But beware that its behaviour is different on a string such as \\\xy\ .

Using GNU utilities:

 sed 's/\\\\/\x00/g' file | grep -ao '\\[^\\]*\\' | sed 's/\x00/\\\\/g'

The first sed replaces each double backslash ( \\ ) with a null character (highly unlikely to occur in the original data to be processed).
The grep captures and prints the characters between matching single backslashes ( \ ). The GNU-specific -a option makes grep process a binary file as if it were a text file, which is needed because the stream may contain null characters at this point. With the -o option (not in POSIX either), grep prints only the matching parts of the line, each one on a separate output line.
The last sed restores the double backslashes by replacing each null character with a \\ .

Please note that these commands are highly GNU-specific.
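The difference mentioned in the note at the top of this answer can be checked on the string \\\xy\ . The sketch below compares the pipeline with the extended-regex pattern from Answer 1; that comparison is only one reading of the note, and the function names pairs_regex and pairs_nul are made up for this example. Both assume GNU grep and GNU sed.

```shell
# Hypothetical wrappers; both assume GNU grep and GNU sed.

# The single extended-regex pattern from Answer 1.
pairs_regex() {
    printf '%s\n' "$1" | grep -oE '\\([^\\]|\\\\)+\\'
}

# The sed | grep | sed pipeline from this answer.
pairs_nul() {
    printf '%s\n' "$1" \
        | sed 's/\\\\/\x00/g' \
        | grep -ao '\\[^\\]*\\' \
        | sed 's/\x00/\\\\/g'
}

pairs_regex '\\\xy\'   # prints \\\xy\
pairs_nul   '\\\xy\'   # prints \xy\
```

The pipeline pairs backslashes strictly from left to right, so the first two backslashes become a pair and the match starts at the third. The regex starts at the first backslash and treats the next two as a pair.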

Test:

 $ line='blah blah blah \the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\ blah blah blah \ \\ \\ \\ \ foobar \ a\\b\\c\\ \'
 $ sed 's/\\\\/\x00/g' <<< "$line" | grep -ao '\\[^\\]*\\' | sed 's/\x00/\\\\/g'
 \the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\
 \ \\ \\ \\ \
 \ a\\b\\c\\ \

Answer 5

With echo, grep and tail...

string='blah blah blah \the rain in sp\\\\ain moves mainly\\ on the p\\lain\\\\\ blah blah blah \ \\ \\ \\ \ foobar \ a\\b\\c\\ \'

echo ${string} | grep -o -E "([ \]{1,2}[ a-z]{0,2}[ \]{0,2}){1,4}" | tail -n2 | grep -o -E "[abc \]{1,32}"

Outputs...

 \ \\ \\ \\ \ 
 \ a\\b\\c\\ \

grep -E means: use an extended regular expression.
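To make the effect of -E concrete, here is a small sketch with a made-up input string (not from the question). In a basic regular expression a bare + is a literal plus sign, while with -E it is the "one or more" quantifier. It assumes a grep that supports -o.

```shell
# Made-up input for illustration only.
sample='a+b aab'

# Basic regex: + is a literal character, so this finds the text "a+".
printf '%s\n' "$sample" | grep -o 'a+'
# prints:
# a+

# Extended regex: + means "one or more", so this finds runs of a.
printf '%s\n' "$sample" | grep -oE 'a+'
# prints:
# a
# aa
```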

5 answers

Answer 1
2 2020-07-31 19:35:37

Answer 2
2 2020-07-31 23:17:21

Answer 3
1 2020-07-31 16:33:27

Answer 4
0 accepted 2020-07-31 17:52:27

Answer 5
0 2020-07-31 18:00:00
