I've got some sample data in the following form and need to extract the email address from it:
from=<user@mail.com> (<-- note that this corresponds to $7)
...
...
Currently I'm using this:
awk '/from=<.*>/ {print $7}' mail.log
However, that only selects the lines that match the regular expression.
When it comes to printing, it still prints the whole field (like in the first text box).
You can use gsub to remove everything around < and >:
awk '{gsub(/(^[^<]*<|>.*$)/, "", $7)}1' file
The key point here is (^[^<]*<|>.*$), a regex that can be split into two alternatives, (A|B):
^[^<]*< matches everything from the beginning of the field up to and including <.
>.*$ matches everything from > up to the end of the field.
$ cat a
1 2 3 4 5 6 from=<user@mail.com> 8
1 2 3 4 5 6 <user@mail.com> 8
$ awk '{gsub(/(^[^<]*<|>.*$)/, "", $7)}1' a
1 2 3 4 5 6 user@mail.com 8
1 2 3 4 5 6 user@mail.com 8
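If you also want to keep the filter from the question and print only the address rather than the whole line, the two ideas can be combined. This is a minimal sketch, assuming the address always sits in field 7 as in the sample; extract_from is a hypothetical helper name, not something from the original post.

```shell
# extract_from is a hypothetical helper; it reads the named files, or stdin if none are given.
extract_from() {
  # Act only on lines whose 7th field starts with from=<,
  # strip everything outside the angle brackets, and print that field alone.
  awk '$7 ~ /^from=</ { gsub(/^[^<]*<|>.*$/, "", $7); print $7 }' "$@"
}

# Made-up sample input: the second line has to=<...> in field 7 and is skipped.
printf '%s\n' \
  '1 2 3 4 5 6 from=<user@mail.com> 8' \
  '1 2 3 4 5 6 to=<other@mail.com> 8' | extract_from
```

On a real log you would call it as extract_from mail.log.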
Warning: gensub is a GNU awk (gawk) extension, so I'm told the regular awk command (often found on non-Linux systems) doesn't support this command:
awk '/from=<([^>]*)>/ { print gensub(/.*from=<([^>]*)>.*/, "\\1", "1");}' mail.log
The core of this is the gensub function. Given a regex, it performs a substitution (by default, operating on the whole line, $0) and returns the modified string instead of changing the input. The replacement, in this case, is "\\1", which refers to the first capture group. So we match the whole line (with a captured group in the middle), then return just the captured part.
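If gensub is not available, roughly the same extraction can be done with match() and substr(), which are part of POSIX awk. This is a sketch under the assumption that the text to extract always follows the literal from=< and ends at the next >; from_addr is a hypothetical helper name.

```shell
# from_addr is a hypothetical helper; it reads the named files, or stdin if none are given.
from_addr() {
  awk 'match($0, /from=<[^>]*>/) {
    # RSTART is where "from=<" begins and RLENGTH is the length of the whole match.
    # Skip the 6 characters of "from=<" and leave out the trailing ">".
    print substr($0, RSTART + 6, RLENGTH - 7)
  }' "$@"
}

# Made-up sample input: only the first line contains from=<...>.
printf '%s\n' \
  'host postfix/qmgr: ABC123: from=<user@mail.com>, size=100' \
  'host postfix/smtp: ABC123: to=<other@mail.com>, status=sent' | from_addr
```

Unlike the $7 approach, this does not depend on which field the address lands in.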
GNU grep can handle this nicely if you use a positive lookbehind:
$ grep -Po '(?<=from=<)[^>]*' file
user@mail.com
This will print anything between from=< and > in file.
iiSeymour's answer is the simplest approach in this case, if you have GNU grep (as he states).
You could even simplify it a little with \K (which drops everything matched up to that point): grep -Po 'from=<\K[^>]*' file.
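One practical difference worth illustrating: the prefix before \K can be of variable length, whereas a PCRE lookbehind generally cannot contain an unbounded quantifier such as *. The sketch below assumes a grep that supports -P; the optional spaces after from= are an invented variation for illustration, not something in the original sample, and from_addr_k is a hypothetical helper name.

```shell
# from_addr_k is a hypothetical helper; it needs a grep with -P (e.g. GNU grep built with PCRE).
from_addr_k() {
  # \K discards "from=", any spaces, and "<" from the reported match,
  # so -o prints only the text up to (not including) the next ">".
  grep -Po 'from= *<\K[^>]*' "$@"
}

# Made-up input: the second line has a space between "=" and "<".
printf '%s\n' \
  'x from=<user@mail.com> y' \
  'x from= <user@mail.com> y' | from_addr_k
```

Both input lines should yield user@mail.com.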
For those NOT using GNU grep (implementations without -P for PCRE (Perl-Compatible Regular Expression) support), you can use the following pipeline, which is not the most efficient, but is easy to understand:
grep -o 'from=<[^>]*' file | cut -d\< -f2
-o causes grep to only output the matched part of the input, which includes from=< in this case. The cut command then prints the substring after the < (the second field (-f2) based on the delimiter < (-d\<)), effectively printing only the email address.
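To see the two stages of that pipeline at work, here is a small sketch on made-up input. It assumes your grep supports -o (GNU, BSD and BusyBox grep do, although it is not required by POSIX); from_addr_pipeline is a hypothetical helper name.

```shell
# from_addr_pipeline is a hypothetical helper; it reads the named files, or stdin if none are given.
from_addr_pipeline() {
  # Stage 1: grep -o emits only the matched text, e.g. "from=<user@mail.com".
  # Stage 2: cut splits on "<" and keeps the second field, i.e. the address.
  grep -o 'from=<[^>]*' "$@" | cut -d'<' -f2
}

# Made-up sample input: the second line has no from=<...> and produces no output.
printf '%s\n' \
  '1 2 3 4 5 6 from=<user@mail.com> 8' \
  'a line without a sender' | from_addr_pipeline
```

Quoting the delimiter as '<' is equivalent to the backslash form -d\< used above.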