Regex: get the second string between two single quotes

Question

Can I get a little help matching a string in the text below?

The default username and password is 'user' and 'ZWiliWH8E2mV'.

I'm trying to get the string between the second set of single quotes: ZWiliWH8E2mV. This string is randomly generated, so I can only rely on the formatting, not on the value itself. After some googling, I can match it with grep:

cat file_name | grep -oP "(?<=').*?(?=')"

but it's the 3rd match, and I'm not sure how to get to it from there. I'm open to using other tools if they're better suited to what I'm trying to do, but I'm not very well versed in them.

Answer 1

As you state in the question, you are trying to get the string between the second set of single quotes. You could match everything up to and including the third single quote, then start the match right after it and continue until the fourth single quote.

The negated character class [^']+ matches one or more characters other than a single quote.

^(?:[^']+'){3}\K[^']+(?=')

Explanation

^ Start of string
(?:[^']+'){3} Match 3 times: one or more chars other than ', followed by a '
\K Clear the match buffer (forget what has been matched up to this point)
[^']+ Match 1+ times any char except ' (what you want to match)
(?=') Positive lookahead, assert that what is directly to the right is a '


The updated code might look like:

cat file_name | grep -oP "^(?:[^']+'){3}\K[^']+(?=')"
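As a minimal sketch of the same pattern without the extra cat, grep can read the file directly. This assumes GNU grep built with PCRE support (the -P option); the function name second_quoted is made up for illustration and is not part of the original answer.

```shell
# Hypothetical helper: print the second single-quoted value from each line of a file.
# Assumes GNU grep with -P (PCRE) support.
second_quoted() {
  # $1 is the file to search; \K drops everything matched up to the third quote.
  grep -oP "^(?:[^']+'){3}\K[^']+(?=')" "$1"
}

# Usage: second_quoted file_name
```

Note that [^']+ requires at least one character before each quote, so a line that begins with a quote would need [^']* in the group instead.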

Answer 2

I'm trying to get the string between the second set of single quotes

Using awk, you can avoid regex:

s="The default username and password is 'user' and 'ZWiliWH8E2mV'."

awk -F "'" '{print $4}' <<< "$s"

ZWiliWH8E2mV

Here we are using ' as the field delimiter, so the 4th field in awk gives us the 2nd value wrapped in single quotes.
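The field arithmetic generalizes: with ' as the separator, the n-th quoted value sits in field 2n. A small sketch of that idea follows; the function name and the awk variable n are illustrative, not part of the original answer.

```shell
# Hypothetical helper: print the n-th single-quoted value of each line on stdin.
# With ' as the field separator, quoted values land in the even fields 2, 4, 6, ...
nth_quoted_awk() {
  # $1 is the 1-based index of the quoted value to print.
  awk -F "'" -v n="$1" '{ print $(2 * n) }'
}

# Usage: printf '%s\n' "$s" | nth_quoted_awk 2
```

A line with fewer than n quoted values yields an empty line, since awk prints an empty string for a field that does not exist.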

Answer 3

You may grab the value between the last two single quotation marks using grep:

grep -oP ".*'\\K[^']+(?=')" file_name


The -o option outputs only the matched substrings, and -P makes grep use the PCRE regex engine.

PCRE regex details

.* - any 0 or more chars other than line break chars, as many as possible
' - a ' char
\K - match reset operator that discards all text matched so far from the overall match memory buffer
[^']+ - one or more chars other than a ' char
(?=') - a positive lookahead that makes sure there is a ' char immediately to the right of the current location.
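To make the "last two quotation marks" behavior concrete, here is a small sketch wrapping the same pattern in a function. It assumes GNU grep with -P support, and the name last_quoted is hypothetical. Because .* is greedy, the engine backtracks from the end of the line, so the match lands on the final quoted value rather than the second one.

```shell
# Hypothetical helper: print the last single-quoted value on each line of stdin.
# Assumes GNU grep with -P (PCRE) support.
last_quoted() {
  grep -oP ".*'\K[^']+(?=')"
}

# Usage: printf '%s\n' "'first' and 'second' and 'third' and the rest" | last_quoted
# prints: third
```

For the sample line in the question, the last quoted value and the second quoted value are the same, which is why this pattern works there.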

Answer 4

If you have multiple single-quoted fields:

s="'first' and 'second' and 'third' and 'fourth' and the rest"

You can use the following Perl one-liner to get the nth field:

echo "$s" |
perl -lne 'while (/[\x27]([^\x27]*)[\x27]/g) {print $1 if ++$i==3}'

# third

So for your example, the password is the second quoted field:

echo "The default username and password is 'user' and 'ZWiliWH8E2mV'." |
perl -lne 'while (/[\x27]([^\x27]*)[\x27]/g) {print $1 if ++$i==2}'

Prints:

ZWiliWH8E2mV

You can also use gawk with FPAT set to the same regex to print the nth field:

s="'first' and 'second' and 'third' and 'fourth' and the rest"

echo "$s" |
gawk -v n=2 'BEGIN{FPAT="[\x27][^\x27]*[\x27]"} 
            { gsub(/[\x27]/,"",$n); print $n}'

# second

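Or you can use a pipeline of two GNU sed commands. The first puts each quoted field on its own line, and the line number printed by the second sed selects the field (4 here). gsed is the name GNU sed is commonly installed under on macOS; on most Linux systems plain sed is already GNU sed: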

echo "$s" |
gsed -E 's/[^\x27]*\x27([^\x27]*)\x27[^\x27]*/\1\n/g' | gsed -nE '4p'
# fourth

Note:

[\x27] is the hex character representation of ' . Hex character representations are supported by most regex implementations, but not all. POSIX sed, for example, is unreliable in this respect.
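Where hex escapes are not available, one way around them is to wrap the script in double quotes so the literal ' can be written directly. The following is a sketch using only POSIX basic regular expressions in sed; the function name is hypothetical and the approach is an alternative to, not a restatement of, the commands above.

```shell
# Hypothetical helper: print the second single-quoted value using only POSIX sed syntax.
# The sed script is in double quotes, so a literal ' needs no \x27 escape.
second_quoted_sed() {
  # Skip the first quoted value, capture the second, and print only lines that matched.
  sed -n "s/^[^']*'[^']*'[^']*'\([^']*\)'.*/\1/p"
}

# Usage: second_quoted_sed < file_name
```

Lines that do not contain two complete quoted values are dropped because of -n combined with the p flag.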

4 answers

Answer 1 | 2 | 2020-08-21 14:01:46
Answer 2 | 1 | 2020-08-21 14:07:22
Answer 3 | 1 | 2020-08-21 15:00:40
Answer 4 | 0 | 2020-08-21 15:10:10
