bash: For each line in a txt file match a regex and save it to a variable array

I am trying to read each line of a text file, extract the name before .tst, and store each match in an array. Here is an example of the text file:

    someTest.tst (/blah/blah/blah),
    someOtherfile.tst (/some/other/blah),
    hello.tst (/not/the/same/blah),
    hi.tst (/is/this/blah),

There is a bunch of whitespace on each line before the characters.

I would like to extract the following values and store them in a variable array:

someTest
someOtherfile
hello
hi

I have tried using sed and awk, but I am not proficient with either and am having trouble getting the result I want. Any insight?

You don't need a regex for this at all.

arr=( )
while read -r name _; do
  [[ $name = *.tst ]] || continue # skip lines whose first field doesn't end in .tst
  arr+=( "${name%.tst}" )
done <input.txt

declare -p arr # print array contents

  • read accepts a list of destination variables. Fields (as determined by splitting the input on the characters in IFS) are assigned to those variables in order, and the last variable receives all remaining content on the line (including internal whitespace). Thus, read -r name _ puts the first field into name and the rest of the input line into a variable named _. Because the default IFS contains whitespace, the leading whitespace on each line is discarded as well.
  • [[ $name = *.tst ]] || continue skips all lines where the first field doesn't end in .tst.
  • "${name%.tst}" expands to the contents of "$name", with the suffix .tst removed if present.
  • The while read; do ...; done <inputfile pattern is described in more detail in BashFAQ #1.
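
As a minimal sketch of the first and third points, the snippet below runs read and the suffix-stripping expansion on a single line. The sample line is modeled on the question's input; the variable names line, rest, stem, and other are illustrative choices, not part of the answer above.

```shell
# Hypothetical sample line modeled on the question's input,
# including the leading whitespace.
line='    someTest.tst (/blah/blah/blah),'

# With the default IFS, read discards leading whitespace and splits on
# whitespace: the first field goes to "name", the remainder to "rest".
read -r name rest <<EOF
$line
EOF

printf 'name=[%s]\n' "$name"
printf 'rest=[%s]\n' "$rest"

# ${name%.tst} removes a trailing ".tst" if one is present.
stem=${name%.tst}
printf 'stem=[%s]\n' "$stem"

# A value without the suffix is left unchanged.
other='README'
printf 'unchanged=[%s]\n' "${other%.tst}"
```

This should print name=[someTest.tst], rest=[(/blah/blah/blah),], stem=[someTest], and unchanged=[README], one per line.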

However, if you wanted to use a regex, that might look like this:

re='^[[:space:]]*([^[:space:]]+)[.]tst[[:space:]]'

arr=( )
while IFS= read -r line; do
  [[ $line =~ $re ]] && arr+=( "${BASH_REMATCH[1]}" )
done <input.txt

declare -p arr # print array contents

Using [[ $string =~ $regex ]] evaluates $regex as an ERE and, if it matches, puts the entirety of the matched content into BASH_REMATCH[0], and any match groups into BASH_REMATCH[1] and onward.
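
To see what ends up in BASH_REMATCH, here is a small sketch that applies the same regex to one line. The sample strings and the variable names sample, whole, stem, nomatch, and matched are assumptions made for illustration. The values are copied out of BASH_REMATCH right away, since a later =~ test overwrites the array.

```shell
re='^[[:space:]]*([^[:space:]]+)[.]tst[[:space:]]'

# Hypothetical sample line modeled on the question's input.
sample='    hello.tst (/not/the/same/blah),'

if [[ $sample =~ $re ]]; then
  whole=${BASH_REMATCH[0]}   # everything the regex matched
  stem=${BASH_REMATCH[1]}    # first parenthesized group only
  printf 'whole match: [%s]\n' "$whole"
  printf 'group 1:     [%s]\n' "$stem"
fi

# A line whose first word does not end in ".tst" fails the test.
nomatch='    notes.txt (/tmp),'
if [[ $nomatch =~ $re ]]; then matched=yes; else matched=no; fi
printf 'second line matched: %s\n' "$matched"
```

Here the whole match includes the leading whitespace and the single whitespace character after .tst, while group 1 holds just hello. That is why the loop above stores BASH_REMATCH[1] rather than BASH_REMATCH[0].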
