I'm trying to extract an IP address from some text, and I don't understand the result my regex produces. Apparently this:
echo '"IPAddress": "173.14.0.3",' | sed -n -r -e 's/"IPAddress": "(.*)"/\1/p'
returns
173.14.0.3,
Why do I get the , at the end? Doesn't "(.*)" instruct the regex to create a matching group of everything between the last two " characters?
Originally I started out with
echo '"IPAddress": "173.14.0.3",' | sed -n -r -e 's/"IPAddress": "([0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3})"/\1/p'
but got the same result. I used regex101, and there I get a different result. Why?
Your input:
input:            "IPAddress": "173.14.0.3",
matched by regex: ^^^^^^^^^^^^^^^^^^^^^^^^^  (note: comma not matched)
captured:                       ^^^^^^^^^^
The matched part is replaced by the captured substring and substituted back into the original string, yielding:
result:       173.14.0.3,
not affected:           ^
replacement:  ^^^^^^^^^^
If you want to get rid of the comma, include it in the match (so it gets substituted by nothing):
s/"IPAddress": "(.*)",/\1/p
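To make the difference concrete, here is a small sketch that runs the original command and two possible fixes on the same input. The variable names (`line`, `with_comma`, `fixed`, `whole_line`) are only for illustration, and `-E` is used in place of `-r` (GNU sed treats the two as synonyms). The third variant is an alternative not mentioned above: it lets `.*` consume everything around the quoted value, so nothing outside the capture group is left over.

```shell
line='"IPAddress": "173.14.0.3",'

# Original attempt: the comma lies outside the match, so it is left in place.
with_comma=$(printf '%s\n' "$line" | sed -n -E 's/"IPAddress": "(.*)"/\1/p')

# Fix from above: include the comma in the match so it is replaced too.
fixed=$(printf '%s\n' "$line" | sed -n -E 's/"IPAddress": "(.*)",/\1/p')

# Alternative: match the whole line and keep only the quoted value.
whole_line=$(printf '%s\n' "$line" | sed -n -E 's/.*"IPAddress": "([^"]*)".*/\1/p')

printf '%s\n' "$with_comma" "$fixed" "$whole_line"
```

The first line printed still ends in a comma; the other two are the bare address.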
regex101 shows the same behavior: https://regex101.com/r/Fy5Lj3/4
General advice: regex101.com does not support the regular expression languages supported by sed. These are explained in the POSIX specification.
While simple things might look similar, expressions differ significantly in detail. Do not use regex101 when working on regular expressions for sed.
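As one illustration of how the dialects differ, the sketch below writes the same extraction twice: once in POSIX basic regular expressions (sed's default) and once in extended regular expressions (`-E`). In the basic form, groups and intervals need backslashes; in the extended form they do not. The variable names are hypothetical, and the dots are escaped as `\.` here, unlike in the pattern from the question, where an unescaped `.` matches any character.

```shell
line='"IPAddress": "173.14.0.3",'

# Basic regular expressions (no -E): \( \) for groups, \{ \} for intervals.
bre=$(printf '%s\n' "$line" |
  sed -n 's/.*"IPAddress": "\([0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}\)".*/\1/p')

# Extended regular expressions (-E): the same pattern without the backslashes.
ere=$(printf '%s\n' "$line" |
  sed -n -E 's/.*"IPAddress": "([0-9]{1,3}(\.[0-9]{1,3}){3})".*/\1/p')

printf '%s\n' "$bre" "$ere"
```

Both commands should print the address without the trailing comma, because the surrounding `.*` parts are included in the match and therefore replaced.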