
How to use awk with a multi-character delimiter

How can I use an awk delimiter that consists of several characters, such as "#@$"?

I have a file like this: Test1#@$Test2#@$Test3#@$Test4 and I need to extract 'Test2'. When I run awk -F "#@$" '{print $2}' , nothing is displayed.

And when I run awk -F "#@$" '{print $1}' , I get the full line.

Any ideas?

The issue you are having is that the field separator FS is treated as a regular expression. The dollar character ( $ ) has a special meaning in regular expressions: it is an anchor for the end of the line. As written, #@$ can only match #@ at the very end of a line, which never occurs in your input, so no splitting happens and the whole record ends up in $1. The solution is to escape the $ twice, because backslash escapes are interpreted twice: once during lexical processing of the string and once when the regular expression is processed:

awk -F '#@\\$' '{print $1}'
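To make the difference concrete, here is a small sketch that runs both separators against the sample line from the question. The variable names (line, unescaped, escaped) are only illustrative, and the behaviour described in the comments assumes a POSIX-style awk such as gawk or mawk.

```shell
# Sample record from the question.
line='Test1#@$Test2#@$Test3#@$Test4'

# Unescaped: "$" acts as an end-of-line anchor, the separator never
# matches, and the whole record stays in $1.
unescaped=$(printf '%s\n' "$line" | awk -F '#@$' '{print $1}')

# Escaped twice: awk first turns "\\" into "\", leaving the regular
# expression #@\$, which matches the literal three-character delimiter.
escaped=$(printf '%s\n' "$line" | awk -F '#@\\$' '{print $2}')

printf 'unescaped: %s\n' "$unescaped"
printf 'escaped:   %s\n' "$escaped"
```

The first line of output should show the untouched record, the second should show Test2.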

An extended regular expression can be used to separate fields by assigning a string containing the expression to the built-in variable FS , either directly or as a consequence of using the -F sepstring option. The default value of the FS variable shall be a single <space>. The following describes FS behaviour:

  1. If FS is a null string, the behaviour is unspecified.
  2. If FS is a single character:

    • If FS is <space>, skip leading and trailing <blank> and <newline> characters; fields shall be delimited by sets of one or more <blank> or <newline> characters.
    • Otherwise, if FS is any other character c , fields shall be delimited by every single occurrence of c .
  3. Otherwise, the string value of FS shall be considered to be an extended regular expression. Each occurrence of a sequence matching the extended regular expression shall delimit fields.

source: POSIX awk standard
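The three FS cases quoted above can be tried out with a few throwaway inputs. This is only a sketch: the input strings and the variable names (n_default, n_comma, third) are made up for illustration and are not part of the original answer.

```shell
# Case: FS is a single space (the default). Leading and trailing blanks
# are skipped and runs of blanks count as one separator.
n_default=$(printf '  a   b  c \n' | awk '{print NF}')

# Case: FS is any other single character. Every occurrence delimits a
# field, so the empty field between the two commas is counted.
n_comma=$(printf 'a,,b\n' | awk -F ',' '{print NF}')

# Case: FS is longer than one character and is treated as an extended
# regular expression; here any run of digits separates fields.
third=$(printf 'a1b22c\n' | awk -F '[0-9]+' '{print $3}')

printf '%s %s %s\n' "$n_default" "$n_comma" "$third"
```

The multi-character separator "#@$" from the question falls under the third case, which is why its $ is read as regular-expression syntax.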


A <dollar-sign> ( $ ) outside a bracket expression shall anchor the expression or subexpression it ends to the end of a string; such an expression or subexpression can match only a sequence ending at the last character of a string. For example, the EREs ef$ and (ef$) match ef in the string abcdef , but fail to match in the string cdefab , and the ERE e$f is valid, but can never match because the f prevents the expression e$ from matching ending at the last character.

source: POSIX Extended Regular Expressions
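The anchoring rule can be checked directly in awk with the two strings from the quoted example. This is a minimal sketch; the variable name result and the output wording are illustrative only.

```shell
# Test the ERE ef$ against both example strings: it matches only when
# "ef" sits at the very end of the string.
result=$(printf 'abcdef\ncdefab\n' |
  awk '{ print $0 ": " ($0 ~ /ef$/ ? "match" : "no match") }')

printf '%s\n' "$result"
```

This should report a match for abcdef and no match for cdefab, which mirrors why an unescaped $ in FS never matches in the middle of a record.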

Just wrap $ in brackets [] to remove its special meaning:

> cat t1
Test1#@$Test2#@$Test3#@$Test4
> awk -F '#@[$]' '{print $2}' t1
Test2
> 
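As a quick sanity check of the bracket-expression form, the following sketch prints every field with its index rather than only $2. The loop and the variable name fields are additions for illustration and do not come from the original answer.

```shell
# Inside a bracket expression "$" is an ordinary character, so no
# backslashes are needed. Print each field with its position.
fields=$(printf '%s\n' 'Test1#@$Test2#@$Test3#@$Test4' |
  awk -F '#@[$]' '{ for (i = 1; i <= NF; i++) print i ": " $i }')

printf '%s\n' "$fields"
```

Seeing four numbered fields confirms that the delimiter is being matched as a literal three-character sequence.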

