I am trying to use mawk, whose match()
built-in function does not accept a third argument (an array to receive the match):
match($1, /9f7fde/) {
substr($1, RSTART, RLENGTH);
}
See the documentation.
How can I store the matched substring in a variable named var
so that I can use it later when constructing my output?
EDIT2 - Complete example:
Input file structure:
<iframe src="https://vimeo.com/191081157" frameborder="0" height="481" width="608" scrolling="no"></iframe>|Random title|Uploader|fun|tag1,tag2,tag3
<iframe src="https://vimeo.com/212192268" frameborder="0" height="481" width="608" scrolling="no"></iframe>|Random title|Uploader|fun|tag1,tag2,tag3
parser.awk:
{
    Embed = $1;
    Title = $2;
    User = $3;
    Categories = $4;
    Tags = $5;
}
BEGIN {
    FS="|";
}
# Regexp without pattern matching for testing purposes
match(Embed, /191081157/) {
    Id = substr(Embed, RSTART, RLENGTH);
}
{
    print Id"\t"Title"\t"User"\t"Categories"\t"Tags;
}
Expected output:
191081157|Random title|Uploader|fun|tag1,tag2,tag3
I want to use the Id
variable outside the match()
block.
MAWK version:
mawk 1.3.4 20160930
Copyright 2008-2015,2016, Thomas E. Dickey
Copyright 1991-1996,2014, Michael D. Brennan
random-funcs: srandom/random
regex-funcs: internal
compiled limits:
sprintf buffer 8192
maximum-integer 2147483647
The obvious answer would seem to be
match($1, /9f7fde/) { var = "9f7fde"; }
But more general would be:
match($1, /9f7fde/) { var = substr($1, RSTART, RLENGTH); }
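Applied to the complete example from the question, a minimal sketch might look like the following. It assumes the ID is the run of digits right after vimeo.com/ (the question hard-codes a single ID only for testing), resets Id on every record so that a failed match never reuses the previous line's value, and sets OFS to "|" to match the expected output. The shell wrapper, the shortened iframe tag, and the second sample line are illustrative only.

```sh
out=$(printf '%s\n' \
  '<iframe src="https://vimeo.com/191081157" frameborder="0"></iframe>|Random title|Uploader|fun|tag1,tag2,tag3' \
  'no embed here|Other title|Uploader|fun|tag4' |
awk '
BEGIN { FS = "|"; OFS = "|" }
{
    Id = ""    # reset so a stale value is never reused
    # Assumption: the ID is the digit run right after "vimeo.com/"
    if (match($1, /vimeo\.com\/[0-9]+/))
        Id = substr($1, RSTART + 10, RLENGTH - 10)    # 10 = length("vimeo.com/")
    print Id, $2, $3, $4, $5
}')
printf '%s\n' "$out"
```

Id is an ordinary global variable, so it stays available to any later rule or to an END block. A record without an embed simply gets an empty first column.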
Let's say the input line is:
.....vimeo.com/191081157" frameborder="0" height="481" width="608" scrolling="no"></iframe>|Random title|Uploader|fun|tag1,tag2,tag3
{mawk/mawk2/gawk} 'BEGIN { OFS = "";
FS = "(^.+vimeo[\056]com[\057]|[\042] frameborder.+[\057]iframe[>])" ;
} (NF < 3) || ($2 !~ /191081157/) { next } { $1 = $1; print }'
\056 is the dot ( . ), \057 is the forward slash ( / ), and \042 is the straight double quote ( " ).
If the line can't be split into the expected fields, or the ID doesn't match, move on to the next row. Otherwise, use the field separator to gobble up all the unneeded parts of the line: the prefix up to vimeo.com/ and the rest of the iframe tag both become separators, leaving an empty $1, the ID in $2, and the remainder of the line in $3.
The assignment $1 = $1 forces awk to rebuild $0 from the fields, joined by the empty OFS, which drops the separators. Because $1 is empty here, the assignment cannot double as the print pattern (its value would be false), so print is called explicitly. This way, you don't need match( ) or substr( ) at all.
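The $1 = $1 idiom can be tried in isolation. The sketch below uses made-up colon-separated input (not from the question) to show two things: assigning to a field rebuilds the record with OFS, and when the assignment is used as a pattern the line is printed only if the assigned value is non-empty and non-zero. An explicit print is therefore the safer form whenever the first field may be empty.

```sh
# Hypothetical sample: the second record has an empty first field.
input='a:b:c
:x:y'

# Assignment used as the pattern: the record prints only when $1 is truthy.
as_pattern=$(printf '%s\n' "$input" | awk 'BEGIN { FS = ":"; OFS = "-" } ($1 = $1)')

# Assignment inside an action with an explicit print: every record prints.
as_action=$(printf '%s\n' "$input" | awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }')

printf '%s\n' "$as_pattern" "$as_action"
```

The first command prints only the record whose first field is non-empty, while the second prints both records with the separators replaced.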