awk output with spaces in first column
I am trying to print a sentence with awk by splitting each line into columns, but the first column contains spaces.
My beginner's attempt:
$ awk '/Linux/ { print "The filename","\""$1"\"","is located in",$2 }' test.txt
The filename "The" is located in test
The filename "Some" is located in file
The filename "File" is located in name
The filename "Something_here" is located in /ABC
The filename "Another_test" is located in /DEFG
The filename "Label" is located in test
The input file, test.txt:
Filename                               Folder         Type
-------------------------------------- -------------- ------
The test file                          /test/folder   Linux
Some file                              /              Linux
File name                              /Temp          Linux
Something_here                         /ABC           Linux
Another_test                           /DEFG          Linux
Label test                             /HIJK          Linux
What I want to achieve (including the quotes):
The filename "Default file" is located in /
The filename "The test file" is located in /test/folder
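The problem is that when I use a space or "/" as the delimiter, I cannot get the whole value when printing.
My suggestion is a sed command based on regular expressions and back-references, preceded by a grep command that eliminates the header lines of the source file: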
$ cat test.txt | grep -E 'Linux[ ]*$' | sed -E 's%(.+)([^ ])([ ]+)(/.+)[ ]+Linux[ ]*$%The filename "\1\2" is located in \4%'
The filename "The test file" is located in /test/folder
The filename "Some file" is located in /
The filename "File name" is located in /Temp
The filename "Something_here" is located in /ABC
The filename "Another_test" is located in /DEFG
The filename "Label test" is located in /HIJK
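A good reference for regular expressions (regex) is the Linux manual.
A detailed description, as requested in the comments:
This is not a rigorous explanation, but it should at least give you enough to go further.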
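As a rough guide, the expression can be read group by group. The sketch below runs the same substitution on a single sample row; the names `row` and `result` are only illustrative, and the column widths (38 and 14) are an assumption read off the dashed line in test.txt.

```shell
# One sample row padded like test.txt (widths 38 and 14 are assumed).
row=$(printf '%-38s %-14s %s' 'The test file' '/test/folder' 'Linux')

# (.+)([^ ])        \1\2: the filename, forced to end on a non-space character
# ([ ]+)            \3:   the padding between the filename and the folder
# (/.+)             \4:   the folder, starting at the slash
# [ ]+Linux[ ]*$          the type column, anchored at the end of the line
result=$(printf '%s\n' "$row" |
  sed -E 's%(.+)([^ ])([ ]+)(/.+)[ ]+Linux[ ]*$%The filename "\1\2" is located in \4%')

printf '%s\n' "$result"
```

Because `.+` is greedy, `\4` also keeps most of the padding that follows the folder, so the output can end in spaces that are invisible in a terminal. Writing that group as `(/[^ ]*)` is one way to avoid it.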
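Another solution is to pass several commands to sed on the command line. You can add a command that deletes the first two header lines, which removes the need for the cat and grep pipeline. Here '1,2d' means "delete lines 1 and 2":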
$ sed -E '1,2d;s%(.+)([^ ])([ ]+)(/.+)[ ]+Linux[ ]*$%The filename "\1\2" is located in \4%' test.txt
The filename "The test file" is located in /test/folder
The filename "Some file" is located in /
The filename "File name" is located in /Temp
The filename "Something_here" is located in /ABC
The filename "Another_test" is located in /DEFG
The filename "Label test" is located in /HIJK
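Note: according to the manual, the -E option switches sed to extended regular expressions. GNU sed has supported it for years, and it is now included in POSIX. On older systems where -E is not supported, you can use -r instead: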
$ sed -r '1,2d;s%(.+)([^ ])([ ]+)(/.+)[ ]+Linux[ ]*$%The filename "\1\2" is located in \4%' test.txt
The filename "The test file" is located in /test/folder
The filename "Some file" is located in /
The filename "File name" is located in /Temp
The filename "Something_here" is located in /ABC
The filename "Another_test" is located in /DEFG
The filename "Label test" is located in /HIJK
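GNU awk accepts a regular expression as the field separator, so you only need to require two or more spaces between your columns. A separator that matches a single space would split "The test file" into three fields again:
awk '/Linux/ { print "The file \""$1"\" is in "$2"." }' FS="[ ][ ]+" test.txt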
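To see the separator at work without the original file, here is a small self-contained sketch. The sample rows are rebuilt with printf, and the column widths (38 and 14) are an assumption taken from the dashed line in the question; `sample` and `result` are illustrative names.

```shell
# Sample rows laid out like test.txt (widths 38 and 14 are assumed).
sample=$(printf '%-38s %-14s %s\n' \
  'Filename' 'Folder' 'Type' \
  'The test file' '/test/folder' 'Linux' \
  'Some file' '/' 'Linux' \
  'Label test' '/HIJK' 'Linux')

# [ ][ ]+ matches runs of two or more spaces, so single spaces stay inside $1.
result=$(printf '%s\n' "$sample" |
  awk -F '[ ][ ]+' '/Linux/ { print "The filename \"" $1 "\" is located in " $2 }')

printf '%s\n' "$result"
```

This relies on every column being separated by at least two spaces. A value that fills its column completely leaves only one space before the next column and would not be split.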
GNU awk also offers fixed-width fields (see info gawk fieldwidths), and you could use the lengths of the dashed line to set them dynamically.
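The idea of reading the widths from the dashed line can be sketched as follows. To stay runnable on awk implementations other than gawk, this version cuts the columns with substr() rather than gawk's FIELDWIDTHS, so it is a swapped-in technique and not the gawk feature itself. The helper `rule`, the variable names, and the widths 38/14/6 are assumptions made for the example.

```shell
# rule N prints N dashes; used to rebuild the header rule of test.txt.
rule() { printf "%${1}s" '' | tr ' ' '-'; }

# Sample in the assumed layout of test.txt (widths 38/14/6).
sample=$(printf '%-38s %-14s %s\n' \
  'Filename' 'Folder' 'Type' \
  "$(rule 38)" "$(rule 14)" "$(rule 6)" \
  'The test file' '/test/folder' 'Linux' \
  'Some file' '/' 'Linux')

result=$(printf '%s\n' "$sample" | awk '
  NR == 2 {                      # dashed line: each run of dashes is one column
    n = split($0, dash, " ")
    pos = 1
    for (i = 1; i <= n; i++) {
      start[i] = pos
      width[i] = length(dash[i])
      pos += width[i] + 1        # +1 for the single space between columns
    }
    next
  }
  NR > 2 {                       # data rows: cut by position, then trim padding
    name   = substr($0, start[1], width[1])
    folder = substr($0, start[2], width[2])
    sub(/ +$/, "", name)
    sub(/ +$/, "", folder)
    print "The filename \"" name "\" is located in " folder
  }')

printf '%s\n' "$result"
```

Cutting by position does not depend on how many spaces separate the columns, so it also copes with values that fill a column completely.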
If you have GNU AWK, this should do the trick:
awk 'match($0, /([^\/]+)([^ ]+) *Linux/, arr) { sub(/ +$/, "", arr[1]); printf("The filename \"%s\" is located in %s\n", arr[1], arr[2]) }' test.txt
Explanation:
# match and store groups in 'arr'
# - arr[1]: everything up until the first slash (including a lot of whitespace)
# - arr[2]: first slash until space
# - rest: also ensure there's 'Linux' after that
match($0, /([^\/]+)([^ ]+) *Linux/, arr) {
    # trim whitespace from the right hand side of the filename
    sub(/ +$/, "", arr[1]);
    # print
    printf("The filename \"%s\" is located in %s\n", arr[1], arr[2])
}
Note that other versions of AWK also have a less powerful version of match, and you can achieve the same thing with them, but you will have to write more code.
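A minimal sketch of that more verbose route, assuming only the two-argument match() together with RSTART and RLENGTH. The sample rows, their column widths (38 and 14), and the variable names are assumptions for illustration.

```shell
# Sample rows laid out like test.txt (widths 38 and 14 are assumed).
sample=$(printf '%-38s %-14s %s\n' \
  'Filename' 'Folder' 'Type' \
  'The test file' '/test/folder' 'Linux' \
  'Some file' '/' 'Linux')

result=$(printf '%s\n' "$sample" | awk '
  /Linux *$/ {
    line = $0
    sub(/ +Linux *$/, "", line)            # drop the type column
    if (match(line, / +\/[^ ]*$/)) {       # padding plus folder at the end
      name   = substr(line, 1, RSTART - 1) # everything before the padding
      folder = substr(line, RSTART, RLENGTH)
      sub(/^ +/, "", folder)               # strip the padding from the folder
      print "The filename \"" name "\" is located in " folder
    }
  }')

printf '%s\n' "$result"
```

Without the array argument, the pieces have to be cut out by hand with substr(), which is where the extra code comes from.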