AWK change field separator multiple times

Below is my sample input; for ease of testing I have combined the text of a few files into one. Normally this script uses the find command to walk each subdirectory looking for versions.tf and runs AWK on each one.

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "> 2.0.0"
    }
  }
  required_version = ">= 0.13"
}


terraform {
  required_providers {
    luminate = {
      source  = "terraform.example.com/nbs/luminate"
      version = "1.0.8"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "2.40.0"
    }
    random = {
      source = "hashicorp/random"
    }
    template = {
      source = "hashicorp/template"
    }
  }
  required_version = ">= 0.13"
}

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=2.38.0, < 3.0.0"
    }
    luminate = {
      source  = "terraform.example.com/nbs/luminate"
      version = "1.0.8"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.0.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "3.0.0"
    }
  }
  required_version = ">= 0.13"
}

My original AWK script looked like this:

/^[[:space:]]{2,2}required_providers/,/^[[:space:]]{2,2}}$/ {
    gsub("\"", "")
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        pr = $1
    }
    if ($0 ~ /version[[:space:]]=[[:space:]]/) {
        printf("%s %s\n", pr, $3)
    }
}

Which would print out the following:

azurerm >           # note this
luminate 1.0.8
azurerm 2.40.0
azurerm >=2.38.0,   # note this
luminate 1.0.8
random 3.0.0
null 3.0.0

When people submitted code to the repo, the version line normally did not contain spaces between the quotes, and my script worked fine. Lately that is no longer the case, so my script breaks on the two lines noted above.

I read in a book ( https://www.packtpub.com/product/learning-awk-programming/9781788391030 ) that the field separator can be changed multiple times within a script:

Now, to switch between two different FS, we can perform the following:

$ vi fs1.awk
{
if ($1 == "#entry")
{ FS=":"; }
else if ($1 == "#exit")
{ FS=" "; }
else
{ print $2 }
}

I have tried this in my script, but it doesn't work. I can only assume that it's because I'm trying to perform the switch in nested functions?

/^[[:space:]]{2,2}required_providers/,/^[[:space:]]{2,2}}$/ {
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        FS = " "
        pr = $1
    }
    if ($0 ~ /version[[:space:]]=[[:space:]]/) {
        FS = "\""
        printf("%s %s\n", pr, $2)
    }
}

Which outputs like:

azurerm =
luminate =
azurerm =
azurerm =
luminate =
random =
null =

Can anyone suggest a fix/workaround for capturing/printing the output so it looks like:

azurerm > 2.0.0
luminate 1.0.8
azurerm 2.40.0
azurerm >=2.38.0, < 3.0.0
luminate 1.0.8
random 3.0.0
null 3.0.0

Changing FS does not take effect immediately. Consider a file.txt with the following content:

1-2-3
4-5-6

then

awk '(NR==1){FS="-"}{print NF}' file.txt

output

1
3

As you can see, the new FS was applied only from the next line onward. If you need to split the current line the way FS would, consider using the split function. For example, with the same input file as above:

awk '{split($0,arr,"-");print arr[1],arr[2],arr[3]}' file.txt

output

1 2 3
4 5 6

(tested in gawk 4.2.1)
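
Applied to the version lines from the question, the same split idea might look like the sketch below. It assumes each version value sits inside a single pair of double quotes. The function name provider_versions is hypothetical, and the range and provider patterns are simplified from the question's script to literal spaces and plain bracket expressions.

```shell
# Hypothetical helper: read Terraform text on stdin and print
# "provider version" pairs, one per line.
provider_versions() {
    awk '
    # Only look at lines inside a two-space-indented required_providers block.
    /^  required_providers/,/^  }$/ {
        # A line such as "    azurerm = {" names the provider.
        if ($0 ~ /^[ \t]*[A-Za-z_][A-Za-z0-9_-]*[ \t]*=[ \t]*[{]/) {
            pr = $1
        }
        # On a version line, split the current record on double quotes.
        # parts[2] is then everything between the first pair of quotes,
        # including any embedded spaces.
        if ($0 ~ /version[ \t]*=/) {
            split($0, parts, "\"")
            print pr, parts[2]
        }
    }'
}
```

Piping the combined sample from the question through provider_versions should print pairs such as azurerm > 2.0.0, with the spaces inside the quotes preserved. If switching FS is still preferred, assigning $0 = $0 right after setting FS makes awk re-split the current record with the new separator.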

A much simpler solution is to just normalize the value you are pulling out. You are using a regex already; just stretch it a little bit further.

/^[[:space:]]{2}required_providers/,/^[[:space:]]{2}}$/ {
    gsub("\"", "")
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        pr = $1
    }
    if ($0 ~ /version[[:space:]]*[<>=]+[[:space:]]*/) {
        ver = $0;
        sub(/^[[:space:]]*version[[:space:]]*(=[[:space:]]*)?/, "", ver);
        print pr, ver
    }
}

Tangentially, notice how I relaxed the whitespace requirements, and replaced {2,2} with the equivalent but more succinct {2} .

With your shown samples only, please try the following. Written and tested in GNU awk.

awk '
!NF{
  found1=found2=0
  val=""
}
/required_providers/{
  found1=1
  next
}
found1 && /^[[:space:]]+[[:alpha:]]+ = {/{
  sub(/^ +/,"",$1)
  val=$1
  found2=1
  next
} found2 && /version/{
  match($0,/".*"/)
  print val,substr($0,RSTART+1,RLENGTH-2)
  found2=0
}
'  Input_file

Explanation: a line-by-line walkthrough of the above.

awk '                        ## Start of the awk program.
!NF{                         ## If NF is 0 (an empty line), do the following.
  found1=found2=0            ## Reset found1 and found2 to 0.
  val=""                     ## Clear val.
}
/required_providers/{        ## If the line contains required_providers, do the following.
  found1=1                   ## Set found1 to 1.
  next                       ## Skip the remaining rules for this line.
}
found1 && /^[[:space:]]+[[:alpha:]]+ = {/{  ## If found1 is set and the line starts with whitespace, letters, then " = {", do the following.
  sub(/^ +/,"",$1)           ## Strip leading spaces from the first field.
  val=$1                     ## Store $1 (the provider name) in val.
  found2=1                   ## Set found2 to 1.
  next                       ## Skip the remaining rules for this line.
} found2 && /version/{       ## If found2 is set and the line contains version, do the following.
  match($0,/".*"/)           ## Match from the first " to the last " on the line.
  print val,substr($0,RSTART+1,RLENGTH-2)  ## Print val and the matched text without its surrounding quotes.
  found2=0                   ## Reset found2 to 0.
}
' Input_file                 ## Name of the input file.
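
The question mentions that the real script is driven by find over many versions.tf files. A minimal sketch of such a driver follows, assuming a two-space-indented required_providers block and quoted version values. The function name scan_versions is hypothetical, and the awk body is a simplified split-based extraction, not any of the scripts above verbatim.

```shell
# Hypothetical driver: scan every versions.tf under a directory
# (default: the current directory) and print "file: provider version".
scan_versions() {
    find "${1:-.}" -type f -name versions.tf -exec awk '
    # Forget the previous provider name whenever a new file starts.
    FNR == 1 { pr = "" }
    /^  required_providers/,/^  }$/ {
        if ($0 ~ /^[ \t]*[A-Za-z_][A-Za-z0-9_-]*[ \t]*=[ \t]*[{]/) {
            pr = $1
        }
        if ($0 ~ /version[ \t]*=/) {
            # Split on double quotes so spaces inside the value survive.
            split($0, parts, "\"")
            print FILENAME ": " pr, parts[2]
        }
    }' {} +
}
```

Called as scan_versions /path/to/repo, this hands batches of files to awk in as few invocations as find can manage, and FILENAME shows which file each pair came from.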
