
awk command issue to recognize delimiter

Experts, any thoughts on why the delimiter is not working in my case? The '^A' here is the literal two-character string '^A' (a caret followed by 'A'), not the control character with ASCII value 1.

cat 2.txt
123^A9343784^A2207983400
45^A1270843^A66789439
67^A188285^A28075164
8^A91183^A27049564
9^A128589^A7283486
100^A84325^A7043462

cat 2.txt | awk -F'^A' '{print $1 }'
123^A9343784^A2207983400
45^A1270843^A66789439
67^A188285^A28075164
8^A91183^A27049564
9^A128589^A7283486
100^A84325^A7043462

BTW, I am working on Mac OSX/Linux.

Thanks in advance, Lin

EDIT

After some valid points made by Ed Morton in the comments area, I have updated my answer to provide slightly more insight into how the awk variants differ in their handling of escaping.


My understanding is that you want to use ^A as the delimiter.

You have to escape the ^ character, because it is a regex metacharacter (the start-of-string anchor) and awk treats a multi-character field separator as a regular expression*. The way to do this is to prepend the double escape sequence \\ to ^.
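To see the anchoring effect in isolation, here is a minimal sketch using awk's match() on one sample line (the `sample` and `positions` variable names are only for illustration). Inside a /.../ regex literal a single backslash is enough; the -F argument is, as far as I understand the POSIX rules, first processed as a string and only then as a regex, which is why the backslash has to be doubled there.

```shell
# One line in the same shape as the data in 2.txt.
sample='123^A9343784^A2207983400'

# match() returns the 1-based position of the first match, or 0 if there is none.
# /^A/  : "A" anchored at the start of the string -> no match on this line.
# /\^A/ : a literal caret followed by "A"         -> first match at position 4.
positions=$(printf '%s\n' "$sample" | awk '{ print match($0, /^A/), match($0, /\^A/) }')
echo "$positions"   # prints: 0 4
```

The unescaped pattern never matches because no line starts with "A", so awk finds no separator and $1 is the whole line.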


- In Linux (awk is usually symlinked to mawk or gawk, see NOTE):

$ cat 2.txt | awk -F'\\^A' '{print $1 }' # mawk, gawk

Now, mawk is slightly more relaxed about this, so it is possible to achieve the same result using only \ (single escape):

$ cat 2.txt | awk -F'\^A' '{print $1 }' # mawk (note the single backslash here)

However, this should generally be avoided (especially in a script or in a passe-partout one-liner, where portability matters), since other awk variants treat a single backslash differently and can produce a variety of unwanted outcomes, some of which are well disguised as legitimate ones in complex situations.


- In Windows (cygwin, MinGW, gnutils provide gawk):

$ cat 2.txt | awk -F'\\^A' '{print $1 }' # gawk

- In OSX (awk is by default nawk):

$ cat 2.txt | awk -F'\\^A' '{print $1 }' # nawk

All these yield:

123
45
67
8
9
100
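If the escaping differences between variants are a concern, one possible way around them is to avoid the regex altogether and search for the separator as a plain string with index(). This is only a sketch of an alternative, not part of the original answer; it extracts just the first field, and the `first_fields` variable name is an assumption for illustration.

```shell
# Feed three of the sample lines through awk; no regex is involved, so "^" is not special.
first_fields=$(
    printf '%s\n' '123^A9343784^A2207983400' '45^A1270843^A66789439' '67^A188285^A28075164' |
    awk '{
        n = index($0, "^A")                        # position of the first literal "^A", 0 if absent
        print (n ? substr($0, 1, n - 1) : $0)      # text before it, or the whole line
    }'
)
echo "$first_fields"
```

For anything beyond the first field, setting the separator with -F'\\^A' as shown above remains the more convenient route.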

* You can find more information on awk's regular expressions here.


NOTE

To find out which variant of awk is available on your system, first locate the awk command itself and then use ls to follow the link chain to the actual binary, like this:

$ which awk
/usr/bin/awk
$ ls -l /usr/bin/awk
lrwxrwxrwx 1 root root ... /usr/bin/awk -> /etc/alternatives/awk
$ ls -l /etc/alternatives/awk
lrwxrwxrwx 1 root root ... /etc/alternatives/awk -> /usr/bin/mawk

(example taken from my system, Xubuntu 14.04)
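The manual steps above can be automated with a small loop. This is a rough sketch that assumes a `readlink` utility is installed (it is common on Linux and OSX, but not guaranteed everywhere); the `awk_path` and `target` variable names are made up for illustration.

```shell
# Start from whatever "awk" resolves to on the PATH.
awk_path=$(command -v awk)

# Follow symbolic links one hop at a time until a non-link is reached.
while [ -L "$awk_path" ]; do
    target=$(readlink "$awk_path")
    case $target in
        /*) awk_path=$target ;;                          # absolute link target
        *)  awk_path=$(dirname "$awk_path")/$target ;;   # target relative to the link's directory
    esac
done

echo "awk resolves to: $awk_path"
```

The name of the final binary (mawk, gawk, and so on) then tells you which escaping behavior to expect.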
