AWK: changing the field separator multiple times

Question

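I have the sample input below; for ease of testing I have combined the text of several files into one. Normally this script would use the find command to search each subdirectory for versions.tf and run AWK on each file it finds.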

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "> 2.0.0"
    }
  }
  required_version = ">= 0.13"
}


terraform {
  required_providers {
    luminate = {
      source  = "terraform.example.com/nbs/luminate"
      version = "1.0.8"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "2.40.0"
    }
    random = {
      source = "hashicorp/random"
    }
    template = {
      source = "hashicorp/template"
    }
  }
  required_version = ">= 0.13"
}

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=2.38.0, < 3.0.0"
    }
    luminate = {
      source  = "terraform.example.com/nbs/luminate"
      version = "1.0.8"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.0.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "3.0.0"
    }
  }
  required_version = ">= 0.13"
}

My original AWK script looks like this:

/^[[:space:]]{2,2}required_providers/,/^[[:space:]]{2,2}}$/ {
    gsub("\"", "")
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        pr = $1
    }
    if ($0 ~ /version[[:space:]]=[[:space:]]/) {
        printf("%s %s\n", pr, $3)
    }
}

This prints the following:

azurerm >           # note this
luminate 1.0.8
azurerm 2.40.0
azurerm >=2.38.0,   # note this
luminate 1.0.8
random 3.0.0
null 3.0.0

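When people committed code to the repo, the version value usually contained no spaces between the double quotes, so my script worked fine, but recently that is no longer the case. As a result my script trips up on the two lines noted above.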

In a book I have been reading I noticed that the field separator can be changed multiple times within a script ( https://www.packtpub.com/product/learning-awk-programming/9781788391030 ):

Now, to switch between two different FS, we can perform the following:

$ vi fs1.awk
{
if ($1 == "#entry")
{ FS=":"; }
else if ($1 == "#exit")
{ FS=" "; }
else
{ print $2 }
}

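I have tried this in my script, but it does not work. I can only assume this is because I am trying to perform the switch inside a nested function?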

/^[[:space:]]{2,2}required_providers/,/^[[:space:]]{2,2}}$/ {
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        FS = " "
        pr = $1
    }
    if ($0 ~ /version[[:space:]]=[[:space:]]/) {
        FS = "\""
        printf("%s %s\n", pr, $2)
    }
}

The output is as follows:

azurerm =
luminate =
azurerm =
azurerm =
luminate =
random =
null =

Can anyone suggest a fix or workaround for capturing and printing the output so that it looks like this:

azurerm > 2.0.0
luminate 1.0.8
azurerm 2.40.0
azurerm >=2.38.0, < 3.0.0
luminate 1.0.8
random 3.0.0
null 3.0.0

Answer 1

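Changing FS does not take effect immediately: the current record has already been split into fields by the time your action runs. Consider the case where file.txt contains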

1-2-3
4-5-6

then

awk '(NR==1){FS="-"}{print NF}' file.txt

output

1
3

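As you can see, the new FS is applied starting from the next line. If you need to split the current line the way FS would, consider using the split function, e.g. for the same input file as above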

awk '{split($0,arr,"-");print arr[1],arr[2],arr[3]}' file.txt

output

1 2 3
4 5 6

(tested in gawk 4.2.1)
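
Applied to the script in the question, the same idea might look like the sketch below. It is not taken from the answer: the helper name extract_versions, the array name parts, and the inline sample are illustrative, and it assumes the indentation is two literal spaces, that provider lines are written as "name = {" with spaces around the equals sign, and that the version value is always wrapped in double quotes.

```shell
# Hypothetical helper: reads Terraform text on stdin, prints "provider version".
extract_versions() {
    awk '
    # Range: from "  required_providers {" to the "  }" line that closes it
    /^  required_providers/,/^  }$/ {
        # Provider line such as "    azurerm = {": default splitting puts the name in $1
        if ($2 == "=" && $3 == "{") {
            pr = $1
        }
        # Version line: split the current record on double quotes, leaving FS alone
        if ($1 ~ /^version/) {
            n = split($0, parts, "\"")
            if (n >= 3) print pr, parts[2]
        }
    }'
}

extract_versions <<'EOF'
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">=2.38.0, < 3.0.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.0.0"
    }
  }
  required_version = ">= 0.13"
}
EOF
```

With this sample the sketch prints "azurerm >=2.38.0, < 3.0.0" followed by "random 3.0.0". Because FS is never reassigned, the provider lines are still split on whitespace as usual.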

Answer 2

A simpler solution is to normalize the value you extract. You are already using regular expressions; just take that a little further.

/^[[:space:]]{2}required_providers/,/^[[:space:]]{2}}$/ {
    gsub("\"", "")
    if ($0 ~ /[[:alpha:]][[:space:]]=[[:space:]]\{/) {
        pr = $1
    }
    if ($0 ~ /version[[:space:]]*[<>=]+[[:space:]]*/) {
        ver = $0;
        sub(/^[[:space:]]*version[[:space:]]*(=[[:space:]]*)?/, "", ver);
        print pr, ver
    }
}

As an aside, note how I relaxed the spacing requirements and replaced {2,2} with the equivalent but more concise {2}.
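
To see the normalization step in isolation, the sketch below runs only the gsub and sub calls on two version lines, one with spaces around the equals sign and one without. The helper name strip_version_prefix is made up for illustration, and [ \t] stands in for [[:space:]] purely as a portability assumption for older awk implementations.

```shell
# Hypothetical helper: prints what is left of each line once the
# quotes and the leading "version =" prefix have been removed.
strip_version_prefix() {
    awk '{
        gsub("\"", "")      # drop every double quote from the record
        ver = $0
        # remove leading blanks, the word version, and an optional = with blanks
        sub(/^[ \t]*version[ \t]*(=[ \t]*)?/, "", ver)
        print ver
    }'
}

printf '%s\n' \
    '      version = "> 2.0.0"' \
    '      version=">=2.38.0, < 3.0.0"' |
    strip_version_prefix
```

This prints "> 2.0.0" and then ">=2.38.0, < 3.0.0": the optional group consumes only the first equals sign, so a constraint that itself starts with >= is left intact.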

Answer 3

Going only by the samples you have shown, could you please try the following. Written and tested in GNU awk.

awk '
!NF{
  found1=found2=0
  val=""
}
/required_providers/{
  found1=1
  next
}
found1 && /^[[:space:]]+[[:alpha:]]+ = {/{
  sub(/^ +/,"",$1)
  val=$1
  found2=1
  next
} found2 && /version/{
  match($0,/".*"/)
  print val,substr($0,RSTART+1,RLENGTH-2)
  found2=0
}
'  Input_file

Explanation: the same program as above, with a detailed comment on each line.

awk '                        ## Start of the awk program.
!NF{                         ## On an empty line (NF is 0):
  found1=found2=0            ## reset both flags
  val=""                     ## and clear the saved provider name.
}
/required_providers/{        ## If the line contains required_providers:
  found1=1                   ## remember that we are inside that block
  next                       ## and skip the remaining rules for this line.
}
found1 && /^[[:space:]]+[[:alpha:]]+ = {/{  ## Inside the block, on a line of leading spaces, letters, then " = {":
  sub(/^ +/,"",$1)           ## strip any leading spaces from the first field
  val=$1                     ## save the provider name
  found2=1                   ## note that a provider line has been seen
  next                       ## and skip the remaining rules for this line.
} found2 && /version/{       ## After a provider line, on a line containing version:
  match($0,/".*"/)           ## locate the text from the first double quote to the last
  print val,substr($0,RSTART+1,RLENGTH-2)  ## print the provider name and the text between the quotes
  found2=0                   ## then wait for the next provider line.
}
' Input_file                 ## Input_file is the name of the file to read.
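
The match and substr step can also be tried on its own. The sketch below is an illustration rather than part of the answer: the helper name quoted_value is made up, and the check on the return value of match is an added guard so that lines without double quotes print nothing.

```shell
# Hypothetical helper: prints the text between the first and last double quote.
quoted_value() {
    awk '{
        # match() sets RSTART to where the match begins and RLENGTH to its length
        if (match($0, /".*"/)) {
            # start one character past the opening quote, stop before the closing one
            print substr($0, RSTART + 1, RLENGTH - 2)
        }
    }'
}

printf '%s\n' '      version = ">=2.38.0, < 3.0.0"' | quoted_value
```

This prints ">=2.38.0, < 3.0.0". Spaces and commas inside the quotes do not matter here, because the value is cut out by position rather than by field number.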

3 answers

Answer 1
1 2021-02-18 08:19:05

Answer 2
1 Accepted 2021-02-18 08:56:12

Answer 3
1 2021-02-18 12:37:46