AWK change field separator multiple times

I have the sample code below; for ease of testing I have combined the text of a few files into one. Normally this script uses the find command to walk each subdirectory looking for versions.tf and runs AWK on each one.

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "> 2.0.0"
    }
  }
  required_version = ">= 0.13"
}


terraform {
  required_providers {
    luminate = {
      source  = "terraform.example.com/nbs/luminate"
      version = "1.0.8"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "2.40.0"
    }
    random = {
      source = "hashicorp/random"
    }
    template = {
      source = "hashicorp/template"
    }
  }
  required_version = ">= 0.13"
}

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=2.38.0, < 3.0.0"
    }
    luminate = {
      source  = "terraform.example.com/nbs/luminate"
      version = "1.0.8"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.0.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "3.0.0"
    }
  }
  required_version = ">= 0.13"
}

My original AWK script looked like this:

/^[[:space:]]{2,2}required_providers/,/^[[:space:]]{2,2}}$/ {
    gsub("\"", "")
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        pr = $1
    }
    if ($0 ~ /version[[:space:]]=[[:space:]]/) {
        printf("%s %s\n", pr, $3)
    }
}

Which would print out the following:

azurerm >           # note this
luminate 1.0.8
azurerm 2.40.0
azurerm >=2.38.0,   # note this
luminate 1.0.8
random 3.0.0
null 3.0.0

When people submitted code to the repo, the version line would normally not contain spaces between the quotes and my script worked fine. Lately, however, that is often not the case, so my script breaks on the two lines noted above.

I noted in a book I've been reading that I can change the field separator multiple times in a script ( https://www.packtpub.com/product/learning-awk-programming/9781788391030 ):

Now, to switch between two different FS, we can perform the following:

$ vi fs1.awk
{
if ($1 == "#entry")
{ FS=":"; }
else if ($1 == "#exit")
{ FS=" "; }
else
{ print $2 }
}

I have tried this in my script, but it doesn't work. I can only assume that it's because I'm trying to perform the switch in nested functions?

/^[[:space:]]{2,2}required_providers/,/^[[:space:]]{2,2}}$/ {
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        FS = " "
        pr = $1
    }
    if ($0 ~ /version[[:space:]]=[[:space:]]/) {
        FS = "\""
        printf("%s %s\n", pr, $2)
    }
}

Which outputs:

azurerm =
luminate =
azurerm =
azurerm =
luminate =
random =
null =

Can anyone suggest a fix/workaround for capturing/printing the output so it looks like:

azurerm > 2.0.0
luminate 1.0.8
azurerm 2.40.0
azurerm >=2.38.0, < 3.0.0
luminate 1.0.8
random 3.0.0
null 3.0.0

Changing FS does not take effect immediately. Consider that if the content of file.txt is

1-2-3
4-5-6

then

awk '(NR==1){FS="-"}{print NF}' file.txt

outputs

1
3

As you can see, the new FS was applied starting from the next line. If you need to split the current line the way FS would, consider using the split function, for example for the same input file as above

awk '{split($0,arr,"-");print arr[1],arr[2],arr[3]}' file.txt

outputs

1 2 3
4 5 6

(tested in gawk 4.2.1)
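
Applied to the script from the question, a minimal sketch of the split approach might look like the following. The array name parts, the scratch file, and the shortened sample are illustrative assumptions. The range pattern is written with two literal spaces (matching the sample's indentation) instead of {2,2}, so that it does not depend on interval-expression support in the awk being used.

```shell
# Shortened sample based on the question's input, written to a scratch file.
tf=$(mktemp)
cat > "$tf" <<'EOF'
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=2.38.0, < 3.0.0"
    }
    luminate = {
      source  = "terraform.example.com/nbs/luminate"
      version = "1.0.8"
    }
  }
  required_version = ">= 0.13"
}
EOF

out=$(awk '
/^  required_providers/,/^  }$/ {
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        pr = $1                      # provider name, split on default whitespace
    }
    if ($0 ~ /version[[:space:]]=[[:space:]]/) {
        split($0, parts, "\"")       # parts[2] is the text between the quotes
        printf("%s %s\n", pr, parts[2])
    }
}
' "$tf")
printf '%s\n' "$out"
rm -f "$tf"
```

Because split works on the current record regardless of FS, the provider name can still come from $1 with the default separator, and FS never needs to change.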

A much simpler solution is to just normalize the value you are pulling out. You are using a regex already; just stretch it a little bit further.

/^[[:space:]]{2}required_providers/,/^[[:space:]]{2}}$/ {
    gsub("\"", "")
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        pr = $1
    }
    if ($0 ~ /version[[:space:]]*[<>=]+[[:space:]]*/) {
        ver = $0;
        sub(/^[[:space:]]*version[[:space:]]*(=[[:space:]]*)?/, "", ver);
        print pr, ver
    }
}

Tangentially, notice how I relaxed the whitespace requirements, and replaced {2,2} with the equivalent but more succinct {2}.
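
To see what the gsub and sub calls above do in isolation, here is a small sketch that feeds three version lines through just those two steps. The third input line (no spaces around the equals sign) is a hypothetical case that does not appear in the question, and the square brackets are only there to make leading or trailing whitespace visible.

```shell
out=$(printf '%s\n' \
    '      version = "> 2.0.0"' \
    '      version = ">=2.38.0, < 3.0.0"' \
    '      version="1.0.8"' |
awk '{
    gsub("\"", "")                   # drop every double quote on the line
    ver = $0
    # strip leading whitespace, the word version, and the assignment sign
    sub(/^[[:space:]]*version[[:space:]]*(=[[:space:]]*)?/, "", ver)
    print "[" ver "]"
}')
printf '%s\n' "$out"
```

What remains after the sub call is the whole constraint string, internal spaces included, which is why this no longer depends on how the line happens to be split into fields.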

With your shown samples only, could you please try the following. Written and tested in GNU awk.

awk '
!NF{
  found1=found2=0
  val=""
}
/required_providers/{
  found1=1
  next
}
found1 && /^[[:space:]]+[[:alpha:]]+ = {/{
  sub(/^ +/,"",$1)
  val=$1
  found2=1
  next
} found2 && /^[[:space:]]*version[[:space:]]*=/{
  match($0,/".*"/)
  print val,substr($0,RSTART+1,RLENGTH-2)
  found2=0
}
'  Input_file

Explanation: Adding detailed explanation for above.

awk '                        ##Starting awk program from here.
!NF{                         ##checking condition if NF is NULL then do following.
  found1=found2=0            ##Setting found1 and found2 to 0 here.
  val=""                     ##Nullifying val here.
}
/required_providers/{        ##Checking if line has required_providers then do following.
  found1=1                   ##Setting found1 to 1 here.
  next                       ##next will skip all further statements from here.
}
found1 && /^[[:space:]]+[[:alpha:]]+ = {/{  ##Checking if found1 is set and line has spaces and alphabets followed by = { then do following.
  sub(/^ +/,"",$1)           ##Substituting initial spaces with NULL here in first field.
  val=$1                     ##Setting $1 to val here.
  found2=1                   ##Setting found2 here.
  next                       ##next will skip all further statements from here.
} found2 && /^[[:space:]]*version[[:space:]]*=/{  ##Checking if found2 is set and line is a version assignment (not required_version).
  match($0,/".*"/)           ##Using match to match regex from " to till " here.
  print val,substr($0,RSTART+1,RLENGTH-2)  ##Printing val and sub string of matched values.
  found2=0                   ##Setting found2 to 0 here.
}
' Input_file                 ##Mentioning Input_file name here. 
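
For the workflow described in the question (running over every versions.tf under a directory tree), a rough sketch might look like the following. The demo directory layout, the file name providers.awk, the variable names, and the FILENAME prefix on each output line are all assumptions added for illustration. The FNR == 1 rule resets state at the start of each file, since a single awk process receives several files when find is used with -exec ... {} +.

```shell
# Hypothetical demo tree; in real use, point find at the repository root instead.
work=$(mktemp -d)
mkdir -p "$work/app1" "$work/app2"

cat > "$work/app1/versions.tf" <<'EOF'
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "> 2.0.0"
    }
  }
  required_version = ">= 0.13"
}
EOF

cat > "$work/app2/versions.tf" <<'EOF'
terraform {
  required_providers {
    random = {
      source = "hashicorp/random"
    }
    null = {
      source  = "hashicorp/null"
      version = "3.0.0"
    }
  }
  required_version = ">= 0.13"
}
EOF

# providers.awk is a hypothetical file name for the awk program.
cat > "$work/providers.awk" <<'EOF'
FNR == 1 { inside = 0; pr = "" }                # new file: reset state
/^  required_providers/ { inside = 1; next }    # assumes two-space indentation
inside && /^  }$/       { inside = 0 }          # end of the providers block
inside && /[[:alpha:]][[:space:]]*=[[:space:]]*\{/ { pr = $1 }
inside && /^[[:space:]]*version[[:space:]]*=/ {
    split($0, parts, "\"")                      # parts[2] is the quoted value
    print FILENAME ": " pr, parts[2]
}
EOF

# sort only makes the demo output order deterministic.
out=$(cd "$work" && find . -type f -name versions.tf -exec awk -f providers.awk {} + | sort)
printf '%s\n' "$out"
rm -rf "$work"
```

Providers without a version line (random in the second demo file) simply produce no output.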
