How to use awk with a multi-character delimiter
How can I use awk with a delimiter that contains multiple characters: "#@$"?
I have a file like this:
Test1#@$Test2#@$Test3#@$Test4
I need to extract 'Test2'. After I execute this command:
awk -F "#@$" '{print $2}'
nothing is displayed. And after that, with
awk -F "#@$" '{print $1}'
I get the full line.
Any ideas?
The issue you are having is that the field separator FS is considered to be a regular expression. The dollar character ($) has a special meaning in regular expressions, as it represents an anchor for the end of the line. The solution is to escape it twice, because backslash escapes are interpreted twice: once in the lexical processing of the string and once in the processing of the regular expression:
awk -F '#@\\$' '{print $1}'
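To see the difference side by side, here is a small sketch that runs the unescaped and the double-escaped separator against the sample line from the question. The variable names (line, unescaped, escaped) are only for illustration, and the input is piped in with printf instead of being read from a file.

```shell
line='Test1#@$Test2#@$Test3#@$Test4'

# Unescaped: FS is the regex "#@$", i.e. "#@" anchored at the end of the
# line. The line does not end in "#@", so nothing splits and $2 is empty.
unescaped=$(printf '%s\n' "$line" | awk -F '#@$' '{print $2}')

# Escaped twice: the string processing turns "\\" into "\", and the
# regex then sees "\$", which is a literal dollar sign.
escaped=$(printf '%s\n' "$line" | awk -F '#@\\$' '{print $2}')

printf 'unescaped: [%s]\n' "$unescaped"
printf 'escaped:   [%s]\n' "$escaped"
```

The first line should print empty brackets and the second should print [Test2].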
An extended regular expression can be used to separate fields by assigning a string containing the expression to the built-in variable FS, either directly or as a consequence of using the -F sepstring option. The default value of the FS variable shall be a single <space>. The following describes FS behaviour:
- If FS is a null string, the behaviour is unspecified.
- If FS is a single character:
  - If FS is <space>, skip leading and trailing <blank> and <newline> characters; fields shall be delimited by sets of one or more <blank> or <newline> characters.
  - Otherwise, if FS is any other character c, fields shall be delimited by every single occurrence of c.
- Otherwise, the string value of FS shall be considered to be an extended regular expression. Each occurrence of a sequence matching the extended regular expression shall delimit fields.
source: POSIX awk standard
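The three defined cases in that list can be checked with a few one-liners. This is only a sketch: the sample inputs and the variable names (default_nf, comma_nf, ere_nf) are made up for illustration, and each command just counts the fields awk finds.

```shell
# FS is a single space (the default): runs of blanks collapse and
# leading/trailing blanks are skipped, so this line has 2 fields.
default_nf=$(printf '  a   b  \n' | awk '{print NF}')

# FS is any other single character: every occurrence delimits, so the
# empty field between the two commas is counted, giving 3 fields.
comma_nf=$(printf 'a,,b\n' | awk -F ',' '{print NF}')

# FS is longer than one character: it is treated as an extended regular
# expression, here "one or more colons", giving 3 fields.
ere_nf=$(printf 'x:::y:z\n' | awk -F ':+' '{print NF}')

echo "$default_nf $comma_nf $ere_nf"
```

This should print 2 3 3. The separator "#@$" from the question falls into the last case, which is why the $ is read as a regex anchor.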
A <dollar-sign> ($) outside a bracket expression shall anchor the expression or subexpression it ends to the end of a string; such an expression or subexpression can match only a sequence ending at the last character of a string. For example, the EREs ef$ and (ef$) match ef in the string abcdef, but fail to match in the string cdefab, and the ERE e$f is valid, but can never match because the f prevents the expression e$ from matching ending at the last character.
source: POSIX Extended Regular Expressions
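The anchoring can be tried directly with awk patterns. The sketch below reuses the abcdef and cdefab strings from the quoted text; the third input (cost$5) and the variable names are invented here to show that $ stops being an anchor inside a bracket expression.

```shell
# "ef$" matches only when "ef" ends the string.
end_match=$(printf '%s\n' 'abcdef' | awk '/ef$/ {print "match"}')
mid_match=$(printf '%s\n' 'cdefab' | awk '/ef$/ {print "match"}')

# Inside a bracket expression, "$" is an ordinary character, so it can
# match a dollar sign in the middle of the string.
literal=$(printf '%s\n' 'cost$5' | awk '/t[$]5/ {print "match"}')

printf 'abcdef: [%s]\n' "$end_match"
printf 'cdefab: [%s]\n' "$mid_match"
printf 'cost$5: [%s]\n' "$literal"
```

Only the cdefab line should come back with empty brackets.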
Just wrap $ in brackets [] to remove its special significance:
> cat t1
Test1#@$Test2#@$Test3#@$Test4
> awk -F '#@[$]' '{print $2}' t1
Test2
>
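The bracketed separator is not tied to the -F option. As a rough sketch, the same string can be assigned to FS in a BEGIN block or passed to split(); the variable names (line, via_begin, via_split, parts) are illustrative, and the input is piped in rather than read from t1.

```shell
line='Test1#@$Test2#@$Test3#@$Test4'

# Set FS inside the program instead of using -F.
via_begin=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "#@[$]" } { print $2 }')

# split() accepts the same separator and returns the number of pieces.
via_split=$(printf '%s\n' "$line" |
    awk '{ n = split($0, parts, "#@[$]"); print n, parts[2] }')

echo "$via_begin"
echo "$via_split"
```

This should print Test2 and then 4 Test2. Since the bracket form contains no backslashes, it does not depend on how many times escapes are processed.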