AWK - add value based on regex

I need to add up the numbers matched by a regex, using awk on Linux.

Basically from this file:

123john456:x:98:98::/home/john123:/bin/bash

I have to add the numbers 123 and 456 using awk, so the result would be 579.

So far I have done the following:

awk -F ':' '$1 ~ VAR+="/[0-9].*(?=:)/" ; {print VAR}' /etc/passwd

awk -F ':' 'VAR+="/[0-9].*(?=:)/" ; {print VAR}' /etc/passwd

awk -F ':' 'match($1, VAR=/[0-9].*?:/) ; {print VAR}' /etc/passwd

From what I've seen, match() doesn't support this at all.

Does anyone have an idea?

UPDATE: it should also work for john123 (result -> 123) and 123john (result -> 123).

$ awk -F':' '{split($1,t,/[^0-9]+/); print t[1] + t[2]}' file
579

With your updated requirements:

$ cat file
123john456:x:98:98::/home/john123:/bin/bash
john123:x:98:98::/home/john123:/bin/bash
123john:x:98:98::/home/john123:/bin/bash

$ awk -F':' '{split($1,t,/[^0-9]+/); print t[1] + t[2]}' file
579
123
123
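
To see why the same command covers all three forms, it can help to print what split() puts in the array. This is only a sketch: the show_pieces name and the shortened sample lines are made up for illustration.

```shell
# Hypothetical helper: show the first two pieces split() produces for field 1.
show_pieces() {
  awk -F':' '{
    split($1, t, /[^0-9]+/)                 # split on runs of non-digits
    printf "%s -> [%s] [%s]\n", $1, t[1], t[2]
  }'
}

printf '%s\n' '123john456:x' 'john123:x' '123john:x' | show_pieces
```

For john123 the first piece is empty and for 123john the second one is; an empty string counts as 0 in the addition, which is why both print 123. Since only t[1] and t[2] are added, a name with three or more digit groups would need a loop.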

With gawk, and for the given example,

awk -F ':' '{a=gensub(/[a-zA-Z]+/,"+", "g", $1); print a}' inputFile | bc

would do the job. More generally:

awk -F ':' '{a=gensub(/[a-zA-Z]+/,"+","g",$1); a=gensub(/^\+/,"","g",a); a=gensub(/\+$/,"","g",a); print a}' inputFile | bc

The regex part replaces every sequence of letters with '+' (e.g., '12johnny34' becomes 12+34). This arithmetic expression is then evaluated by bc. (To be safe, I remove leading and trailing '+' signs with ^\+ and \+$; the + is escaped so that it is read as a literal plus sign.)
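
If gawk is not available, the same expression can probably be built with POSIX gsub() and sub(). The sketch below is an assumption-laden variant, not the answer's exact method: to_expr is a made-up name, it treats every non-digit (not just letters) as a separator, and it prints 0 when the name has no digits so that bc still gets valid input.

```shell
# Hypothetical helper: turn field 1 into a "+"-joined expression using
# POSIX gsub()/sub(), swapped in here for gawk gensub().
to_expr() {
  awk -F':' '{
    a = $1
    gsub(/[^0-9]+/, "+", a)   # every run of non-digits becomes one "+"
    sub(/^\+/, "", a)         # drop a leading "+"
    sub(/\+$/, "", a)         # drop a trailing "+"
    print (a == "" ? 0 : a)   # no digits at all: print 0
  }'
}

printf '%s\n' '12johnny34:x' 'john123:x' '123john:x' | to_expr
```

This prints 12+34, 123 and 123; piping that into bc gives the sums.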

You can use [^0-9]+ as the field separator and :[^\n]*\n as the record separator instead (a regular-expression RS is an extension to POSIX awk; gawk supports it):

awk -F '[^0-9]+' 'BEGIN{RS=":[^\n]*\n"}{print $1+$2}' /etc/passwd

so that, given an /etc/passwd containing:

123john456:x:98:98::/home/john123:/bin/bash
john123:x:98:98::/home/john123:/bin/bash
123john:x:98:98::/home/john123:/bin/bash

This outputs:

579
123
123
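
On an awk that only honours a single-character RS, a similar effect can be had by cutting each record at the first colon with sub() and keeping the same field separator. This is a sketch under that assumption; sum_two is a made-up name.

```shell
# Hypothetical helper: same FS as above, but the line is truncated with
# sub() rather than relying on a regex RS.
sum_two() {
  awk -F '[^0-9]+' '{
    sub(/:.*/, "")     # keep only the user name; awk re-splits $0 into fields
    print $1 + $2      # an empty field counts as 0
  }'
}

printf '%s\n' \
  '123john456:x:98:98::/home/john123:/bin/bash' \
  'john123:x:98:98::/home/john123:/bin/bash' \
  '123john:x:98:98::/home/john123:/bin/bash' | sum_two
```

As with the RS version, only the first two digit groups of the name are added.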

I used awk's split() to separate the first field on any run of non-digit characters.

split(string, target_array, [regex], [separator_array]*)

*separator_array requires gawk

$ awk -F: '{split($1, A, /[^0-9]+/, S); print S[1], A[1]+A[2]}' <<EOF
123john456:x:98:98::/home/john123:/bin/bash
123john:x:98:98::/home/john123:/bin/bash
EOF

john 579
john 123
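
Without gawk's fourth argument, the name part can be approximated by deleting the digits from a copy of the field. A rough sketch (name_and_sum is a made-up name); note that, unlike S[1], this joins all non-digit pieces together, so 1j2o3 would give "jo" rather than "j".

```shell
# Hypothetical helper: POSIX-only stand-in for the separator array.
name_and_sum() {
  awk -F: '{
    name = $1
    gsub(/[0-9]+/, "", name)       # what is left once the digits are deleted
    split($1, A, /[^0-9]+/)        # the digit groups, as before
    print name, A[1] + A[2]
  }'
}

printf '%s\n' '123john456:x:98' '123john:x:98' | name_and_sum
```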

You may use

awk -F ':' '{n=split($1, a, /[^0-9]+/); b=0; for (i=1;i<=n;i++) { b += a[i]; }; print b; }' /etc/passwd

See the online awk demo:

s="123john456:x:98:98::/home/john123:/bin/bash
john123:x:98:98::/home/john123:/bin/bash"
awk -F ':' '{n=split($1, a, /[^0-9]+/); b=0; for (i=1;i<=n;i++) { b += a[i]; }; print b; }' <<< "$s"

Output:

579
123

Details

  • -F ':' - each record is split into fields on the : character
  • n=split($1, a, /[^0-9]+/) - splits field 1 on runs of non-digits, storing the digit-only chunks in the array a; n holds the number of chunks
  • b=0 - b will hold the sum
  • for (i=1;i<=n;i++) { b += a[i]; } - iterates over the array a and sums the values
  • print b - prints the result.
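
The steps above can also be packaged as an awk function, which may be easier to read and reuse. This is a sketch only; digit_sum and sum_digits are made-up names, and the extra parameters after s are the usual awk idiom for local variables.

```shell
sum_digits() {
  awk -F ':' '
    # Hypothetical function: add up every run of digits in string s.
    function digit_sum(s,    parts, n, i, total) {
      n = split(s, parts, /[^0-9]+/)    # digit-only chunks (some may be empty)
      total = 0
      for (i = 1; i <= n; i++)
        total += parts[i]               # an empty chunk adds 0
      return total
    }
    { print digit_sum($1) }
  '
}

printf '%s\n' '123john456:x:98' 'john123:x:98' '1j2o3h4n5:x:98' | sum_digits
```

Because the loop runs over all n chunks, names with any number of digit groups are handled, and a name without digits prints 0.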

Here is another awk variant that adds all the numbers present in the first :-separated field:

cat file
123john456:x:98:98::/home/john123:/bin/bash
john123:x:98:98::/home/john123:/bin/bash
123john:x:98:98::/home/john123:/bin/bash
1j2o3h4n5:x:98:98::/home/john123:/bin/bash

awk -F '[^0-9:]+' '{s=0; for (i=1; i<=NF; i++) {s+=$i; if ($i~/:$/) break} print s}' file

579
123
123
15
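
To see how the break condition works, it may help to print the fields that '[^0-9:]+' produces up to the first one ending in a colon. A small sketch (show_fields and the shortened input lines are made up):

```shell
# Hypothetical helper: list the fields the loop visits before it breaks.
show_fields() {
  awk -F '[^0-9:]+' '{
    out = ""
    for (i = 1; i <= NF; i++) {
      out = out "[" $i "]"
      if ($i ~ /:$/) break     # this field closes the user name
    }
    print out
  }'
}

printf '%s\n' '123john456:x:98' '1j2o3h4n5:x:98' '123john:x:98' | show_fields
```

This prints [123][456:], then [1][2][3][4][5:], then [123][:]. A field such as "456:" is converted to 456 when added, and a lone ":" counts as 0.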

You can also try Perl:

$ cat johnny.txt
123john456:x:98:98::/home/john123:/bin/bash
john123:x:98:98::/home/john123:/bin/bash
123john:x:98:98::/home/john123:/bin/bash

$ perl -F: -lane ' $_=$F[0]; $sum+= $1 while(/(\d+)/g); print $sum; $sum=0 ' johnny.txt
579
123
123

$
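
The scan-and-accumulate loop of the Perl version can be approximated in POSIX awk with match(), which sets RSTART and RLENGTH for each hit. A sketch, with sum_matches as a made-up name:

```shell
# Hypothetical helper: sum every digit run in field 1 using match().
sum_matches() {
  awk -F: '{
    rest = $1; sum = 0
    while (match(rest, /[0-9]+/)) {
      sum += substr(rest, RSTART, RLENGTH)     # the digits just matched
      rest = substr(rest, RSTART + RLENGTH)    # continue after the match
    }
    print sum
  }'
}

printf '%s\n' '123john456:x:98' 'john123:x:98' '123john:x:98' '1j2o3h4n5:x:98' | sum_matches
```

This needs no lookahead or lazy quantifiers, which awk regexes do not offer; the loop simply consumes the string one match at a time.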
