AWK Multiple Field Separators and Variables

I am trying to perform calculations in awk using fields whose numbers are passed in from the shell, as well as the last four fields.

For example, I call my shell script like this:

./myProgram myFile.txt 1 2 3 4

Then within my shell script I want to use awk to refer to fields in a text file like this, specifically the last four fields, $(NF-3) through $(NF):

0000000022:trevor:736:1,2:3,4
0000000223:john:73:5,6:7,8
0000002224:eliza:54:9,8:7,6
0000022225:paul:22:5,4:3,2
0000222226:chris:0:1,2:3,4

So I can go through the fields; however, when I do, it doesn't seem to work because there are two types of field separators.

My shell script so far:

#! /usr/bin/env bash

file="$1"

awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "u1 =", $u1 }' $1
awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "v1 =", $v1 }' $1
awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "u2 =", $u2 }' $1
awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "v2 =", $v2 }' $1

echo "Argument #1 =" $2
echo "Argument #2 =" $3
echo "Argument #3 =" $4
echo "Argument #4 =" $5

This is the output I get from the terminal:

u1 = 1
u1 = 5
u1 = 9
u1 = 5
u1 = 1
v1 = awk: illegal field $(), name "v1"
 input record number 1, file database.txt
 source line number 1
u2 = awk: illegal field $(), name "u2"
 input record number 1, file database.txt
 source line number 1
v2 = awk: illegal field $(), name "v2"
 input record number 1, file database.txt
 source line number 1
Argument #1 = 1
Argument #2 = 2
Argument #3 = 3
Argument #4 = 4

When you use $N in awk, it retrieves field N. You can combine this with passing variables to awk, as you have done, to access a field whose number is held in a shell variable. The main issue appears to be that you are passing variables that haven't been set in your script.

In your example invocation of the script, you're not passing enough arguments for positional parameters $6 and above to be defined. This is what causes the error messages that look like illegal field $(): v1 is an empty string, so you're attempting to get a field with no number.
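One way to catch this before awk ever runs is to validate the arguments up front. The following is only a minimal sketch, assuming the script expects a file name followed by exactly four field numbers; the function name `check_args` and the usage text are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only if a file name and four field numbers are given.
check_args() {
    if [ "$#" -ne 5 ]; then
        echo "usage: myProgram FILE U1 V1 U2 V2" >&2
        return 1
    fi
    shift   # drop the file name, leaving the four field numbers
    for n in "$@"; do
        # An empty or non-numeric value would make $var an illegal field in awk.
        case $n in
            ''|*[!0-9]*) echo "not a field number: '$n'" >&2; return 1 ;;
        esac
    done
}
```

Calling `check_args "$@" || exit 1` at the top of the script would then stop early with a usage message instead of producing awk errors.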

NF is a special variable in awk that contains the number of fields, so to access the last four fields, you can use $(NF-3), $(NF-2), $(NF-1), and $NF.
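As a rough illustration of using these in a calculation, the sketch below sums the last four fields of each line. The sum is a made-up example, since the question does not say which calculation is needed; the two sample lines are taken from the question.

```shell
#!/usr/bin/env bash
# Split on ':' or ',' and add up the last four fields of each record.
# The sum is only an illustrative calculation, not the asker's actual one.
sums=$(printf '%s\n' \
    '0000000022:trevor:736:1,2:3,4' \
    '0000002224:eliza:54:9,8:7,6' |
    awk -F'[:,]' '{ print $2, $(NF-3) + $(NF-2) + $(NF-1) + $NF }')
printf '%s\n' "$sums"
```

This prints the name followed by the total, i.e. `trevor 10` and `eliza 30`.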

There was a \ before the awk command which wasn't doing anything useful, so I removed that as well.

There are a couple of other issues with your code that are worth mentioning too. Quote your shell variables! This prevents problems with word splitting on more complex values. If your arguments are numbers with no spaces, it won't make any difference, but it does no harm either and is a good habit to get into. You've defined file, so I've used that instead of $1.

Combining those changes, we end up with something like this:

awk -F'[:,]' -v u1="$2" -v v1="$3" -v u2="$4" -v v2="$5" '{ print "u1 =", u1 }' "$file"
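Inside the awk program, `u1` and `$u1` mean different things: `u1` is the number passed in from the shell, while `$u1` is the field with that number. The sketch below shows both in a single awk call. It assumes two sample lines from the question and the field numbers 4 to 7, chosen here because they are the last four fields when splitting on `[:,]`; the temporary file and the `set --` line only simulate a real invocation.

```shell
#!/usr/bin/env bash
file=$(mktemp)
cat > "$file" <<'EOF'
0000000022:trevor:736:1,2:3,4
0000000223:john:73:5,6:7,8
EOF

# Simulate calling: ./myProgram "$file" 4 5 6 7
set -- "$file" 4 5 6 7

# u1 is the field number itself; $u1 is the contents of that field.
out=$(awk -F'[:,]' -v u1="$2" -v v1="$3" -v u2="$4" -v v2="$5" \
    '{ print "u1 is field", u1, "with value", $u1, "| v1 =", $v1, "u2 =", $u2, "v2 =", $v2 }' "$file")
printf '%s\n' "$out"
rm -f "$file"
```

For the first line this prints `u1 is field 4 with value 1 | v1 = 2 u2 = 3 v2 = 4`.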

Looking at just one line:

awk -F'[:,]' -v u1=$5 -v v1=$6 -v u2=$7 -v v2=$8 \ '{ print "u1 =", $u1 }' $1

Here $5, $6, $7 and $8 are bash positional parameters, not awk field positions. According to your command line, your script has 5 parameters:

./myProgram myFile.txt 1 2 3 4

$1 = myFile.txt
$2 = 1
$3 = 2
$4 = 3
$5 = 4
$6 = 
$7 =
$8 =

That's why awk only starts complaining at $v1: v1 is empty, so $v1 is equivalent to a bare $, which is not a valid field reference.

If I understood your problem properly, you wish to get the lines where the last 4 fields match those values:

awk -F'[:,]' '{ print "u1=",$(NF-3),"v1=",$(NF-2),"u2=",$(NF-1),"v2=",$NF }' "$1"

NF is the number of fields, so NF-3 is the index of the fourth field from the end.
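If the goal really is to select only the lines whose last four fields equal the four values given on the command line, one possible sketch is the following. It assumes the arguments are values to compare against rather than field numbers, and the function name `match_last_four` is hypothetical.

```shell
#!/usr/bin/env bash
# match_last_four FILE U1 V1 U2 V2
# Print the records of FILE whose last four fields equal the given values.
match_last_four() {
    # NF >= 4 avoids referencing a negative field number on short lines.
    awk -F'[:,]' -v u1="$2" -v v1="$3" -v u2="$4" -v v2="$5" \
        'NF >= 4 && $(NF-3) == u1 && $(NF-2) == v1 && $(NF-1) == u2 && $NF == v2' "$1"
}
```

With the sample data from the question, `match_last_four myFile.txt 1 2 3 4` would print the trevor and chris lines, since both end in 1,2:3,4.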

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original source.