
Is it possible to use two different Field Separators in awk and store values from both in variables?

I guess the general question I have is: is it possible to give awk a field separator, store one of the tokens in a variable, then give awk a different field separator, store one of the tokens in a second variable, and then print both variable values? It seems as though the variables store a reference to the $nth token rather than the value itself.

The specific example I had in mind more or less follows this form: {Animal}, {species} class

Cat, Felis catus MAMMAL
Dog, Canis lupus familiaris MAMMAL
Peregrine Falcon, Falco peregrinus AVIAN
...

and you want it to output something like:

Cat MAMMAL
Dog MAMMAL
Peregrine Falcon AVIAN
...

Where what you want is something that fits the form: {Animal} class

where anything enclosed in {} may contain any number of spaces.

My original idea was I would have something like this:

cat test.txt | awk '{FS=","}; {animal=$1}; {FS=" "}; {class=$NF}; {print animal, class}' > animals.txt

I expect the variable "animal" to store what's to the left of the comma, and "class" to hold the class type of that animal (MAMMAL, etc.). But what ends up happening is that only the last field separator used is applied, so this breaks for animals that have spaces in the name, like Peregrine Falcon.

So the output looks something like:

Cat, MAMMAL
Dog, MAMMAL
Peregrine AVIAN

One way using awk:

awk -F, '{ n = split($2,array," "); printf "%s, %s\n", $1, array[n] }' file.txt

Results:

Cat, MAMMAL
Dog, MAMMAL
Peregrine Falcon, AVIAN
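
Since the question asks about keeping both values in variables, the same idea can be sketched with named variables and the comma-free output format from the question. This is only a sketch, not the answerer's exact command: the wrapper function animal_class and the names animal, class and parts are illustrative, and it assumes each line contains exactly one comma.

```shell
# Hypothetical wrapper; reads lines like "Cat, Felis catus MAMMAL" on stdin.
animal_class() {
  awk -F, '{
    animal = $1                  # text before the comma
    n = split($2, parts, " ")    # split the rest of the line on whitespace
    class = parts[n]             # the last word is the class
    print animal, class
  }'
}

# Sample data from the question, supplied through a here-document.
result=$(animal_class <<'EOF'
Cat, Felis catus MAMMAL
Dog, Canis lupus familiaris MAMMAL
Peregrine Falcon, Falco peregrinus AVIAN
EOF
)
printf '%s\n' "$result"
```

The variables hold copies of the values, so they stay valid even if the fields are changed later in the same action.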

You can always split() inside your awk script. You can also manipulate fields causing the entire line to be re-parsed. For example, this gets the results in your question:

awk '{cl=$NF; split($0,a,", "); printf("%s, %s\n", a[1], cl)}' test.txt
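
The re-parsing mentioned above can be sketched as follows: assigning to $0 makes awk split the record again with whatever FS is current at that moment. This is a minimal illustration under assumptions, not part of the original answer: the function name reparse_fields and the variable names are made up, and FS is restored at the end of the action so that the next record is again split on whitespace.

```shell
# Hypothetical wrapper; reads lines like "Cat, Felis catus MAMMAL" on stdin.
reparse_fields() {
  awk '{
    class = $NF     # last whitespace-separated field (default FS)
    FS = ","        # switch to a comma separator
    $0 = $0         # reassigning the record forces a re-split with the new FS
    animal = $1     # everything before the first comma
    FS = " "        # restore the default before the next record is read
    print animal, class
  }'
}

# Sample data from the question, supplied through a here-document.
result=$(reparse_fields <<'EOF'
Cat, Felis catus MAMMAL
Dog, Canis lupus familiaris MAMMAL
Peregrine Falcon, Falco peregrinus AVIAN
EOF
)
printf '%s\n' "$result"
```

Reading $NF before changing FS matters here: it captures the class while the record is still split on whitespace.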

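The field separator for awk can be any regular expression, but in this case it might be easier to use the record separator instead. Setting it to [,\n] makes the records alternate between the two parts you want (a regular-expression RS is an extension beyond POSIX, supported by gawk and mawk among others):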

awk -v RS='[,\n]' 'NR % 2 { printf("%s, ", $0) } NR % 2 == 0 { print $NF }'

So odd-numbered records are output in their entirety, and even-numbered records output only their last field.
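
If the values should end up in variables, as the question asks, the record-separator approach could be adapted roughly as below. Treat it as a sketch: the function name alternate_records and the variable animal are illustrative, it assumes exactly one comma per line, and it relies on an awk that accepts a regular expression as RS.

```shell
# Hypothetical wrapper; reads lines like "Cat, Felis catus MAMMAL" on stdin.
alternate_records() {
  awk -v RS='[,\n]' '
    NR % 2 { animal = $0; next }   # odd record: the text before the comma
    { print animal, $NF }          # even record: its last word is the class
  '
}

# Sample data from the question, supplied through a here-document.
result=$(alternate_records <<'EOF'
Cat, Felis catus MAMMAL
Dog, Canis lupus familiaris MAMMAL
Peregrine Falcon, Falco peregrinus AVIAN
EOF
)
printf '%s\n' "$result"
```

The odd record is saved rather than printed immediately, so both values are available together when the even record arrives.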

paste -d, <(cut -d, -f1 input.txt) <(awk '{print $NF}' input.txt)
  • cut the first column
  • awk get the last column
  • paste them together

output:

Cat,MAMMAL
Dog,MAMMAL
Peregrine Falcon,AVIAN
