
Bash script pattern matching

I need to find patterns that are 6 digits long, where the first 3 digits are specific digits and the remaining 3 can be any digit. For example, 6-digit strings starting with 123 followed by any 3 digits.

var1="abc,123111,"
var2="abcdefg,123222,"
var3="xyzabc,987111,"

if [[ $var1 == *",123ddd,"* ]] ; then echo "Pattern matched"; fi

Where ddd are any digits. var1 and var2 would match the pattern, but var3 would not. I can't seem to get it just right.

Use a character class: [0-9] matches 0, 9, and every character between them in the character set, which, at least in Unicode (e.g. UTF-8) and its subset character sets (e.g. US-ASCII, Latin-1), are the digits 1 through 8. So it matches any one of the 10 Latin digits.

if [[ $var1 == *,123[0-9][0-9][0-9],* ]] ; then echo "Pattern matched"; fi
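As a quick check, here is a minimal sketch that runs the same glob pattern against all three sample strings from the question. The function name has_123_code is made up for illustration and is not part of Bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the argument contains ",123" followed by
# exactly three digits and then a comma.
has_123_code() {
  [[ $1 == *,123[0-9][0-9][0-9],* ]]
}

# The three sample values from the question.
for s in "abc,123111," "abcdefg,123222," "xyzabc,987111,"; do
  if has_123_code "$s"; then
    echo "$s: Pattern matched"
  else
    echo "$s: no match"
  fi
done
```

Because the pattern includes the surrounding commas, a field with more or fewer than six digits (such as ,12311, or ,1231111,) is not matched.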

Using =~ instead of == changes the pattern type from shell standard "glob" patterns to regular expressions ("regexes" for short). You can make an equivalent regex a little shorter:

if [[ $var1 =~ ,123[0-9]{3}, ]] ; then echo "Pattern matched"; fi

The first shortening comes from the fact that a regex only has to match any part of the string, not the whole thing. Therefore you don't need the equivalent of the leading and trailing *s that you find in the glob pattern.
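To illustrate the "any part of the string" behaviour, the sketch below compares the unanchored regex with a version anchored by ^ and $. The variable names matched_part and anchored are only for this example; BASH_REMATCH is the array Bash fills in after a successful =~ match.

```shell
#!/usr/bin/env bash
s="abc,123111,"

# Unanchored: the regex may match anywhere inside the string.
if [[ $s =~ ,123[0-9]{3}, ]]; then
  matched_part=${BASH_REMATCH[0]}   # the substring that actually matched
  echo "unanchored regex matched: $matched_part"
fi

# Anchored with ^ and $: the regex must now cover the whole string,
# so the leading "abc" prevents a match.
if [[ $s =~ ^,123[0-9]{3},$ ]]; then
  anchored=yes
else
  anchored=no
fi
echo "anchored regex matched: $anchored"
```

If you do want whole-string matching with a regex, the anchors are how you ask for it; the glob form gets the same effect by leaving out the leading and trailing *.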

The second length reduction is due to the {n} syntax, which lets you specify an exact number of repetitions of the previous pattern instead of actually repeating the pattern itself in the regex. (You can also match any of a range of repetition counts by specifying a minimum and maximum, such as [0-9]{2,4} to match either two, three, or four digits in a row.)
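A small sketch of the minimum/maximum form, assuming you wanted ",123" followed by two to four digits rather than exactly three. The helper name and the test strings are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the argument contains ",123" followed by
# two, three, or four digits and then a comma.
has_123_then_2_to_4_digits() {
  [[ $1 =~ ,123[0-9]{2,4}, ]]
}

# One, two, four, and five trailing digits respectively.
for s in ",1231," ",12311," ",1231111," ",12311111,"; do
  if has_123_then_2_to_4_digits "$s"; then
    echo "$s: matched"
  else
    echo "$s: no match"
  fi
done
```

The trailing comma in the regex is what enforces the upper bound: with five digits, the fifth digit sits where the comma is required, so the match fails.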

It's worth noting that you could also use a named character class to match digits. Depending on your locale, [[:digit:]] may be exactly equivalent to [0-9], or it may include characters from other scripts with the Unicode "Number, Decimal Digit" property.

if [[ $var1 =~ ,123[[:digit:]]{3}, ]] ; then echo "Pattern matched"; fi
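Named classes are not limited to regexes; they can also be used inside bracket expressions in glob patterns. A minimal sketch, with a made-up helper name, of the glob version written with [[:digit:]]:

```shell
#!/usr/bin/env bash
# Hypothetical helper: same test as the [0-9] glob, but using the named
# character class for each of the three trailing digits.
has_123_code_named_class() {
  [[ $1 == *,123[[:digit:]][[:digit:]][[:digit:]],* ]]
}

for s in "abc,123111," "xyzabc,987111,"; do
  if has_123_code_named_class "$s"; then
    echo "$s: Pattern matched"
  else
    echo "$s: no match"
  fi
done
```

Glob patterns have no {n} repetition syntax, so the class has to be written out three times here.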

In Bash glob pattern matching, [0-9] can be used to match a single digit:

if [[ $var1 == *,123[0-9][0-9][0-9],* ]] ; then echo "Pattern matched"; fi

Alternatively, you can use regex pattern matching with =~:

if [[ $var1 =~ .*,123[0-9]{3},.* ]] ; then echo "Pattern matched"; fi
