Working with AWK regex

I have a file that contains values in the following format:

20/01/2012 01:14:27;UP;UserID;User=bob email=abc@sample.com

I want to pick each value from this file (not the labels). By label I mean that for the string email=abc@sample.com I only want to pick abc@sample.com, and for the string User=bob I only want to pick bob. All the space-separated values are easy to pick, but I am unable to pick the values separated by semicolons. Below is the command I am using in awk:

awk '{print "1=",$1} /;/{print "2=",$2,"3=",$3}' sample_file

In $2 I get the complete string up to bob, and the rest of the string is assigned to $3. Although I could work with the substr function provided by awk, I want to be on the safe side, since the string length may vary. Can somebody tell me how to design a regex to parse my file?

You can set multiple delimiters using awk -F:

awk -F "[ \t;=]+" '{ print $1, $2, $3, $4, $5, $6, $7, $8 }' file.txt

Results:

value1 value2 value3 value4 label1 value5 label2 value6
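
To see the split concretely, here is a minimal sketch that feeds the sample line from the question through the same separator and prints each field with its index. The function name split_fields is hypothetical and only used for illustration; the separator is the one from the command above.

```shell
# Hypothetical helper: split each input line on runs of spaces, tabs,
# semicolons, or equals signs and print every field with its index.
split_fields() {
    awk -F '[ \t;=]+' '{ for (i = 1; i <= NF; i++) print i "=" $i }' "$@"
}

# Sample line taken from the question.
printf '%s\n' '20/01/2012 01:14:27;UP;UserID;User=bob email=abc@sample.com' | split_fields
```

With this separator the labels User and email become fields of their own ($5 and $7), so the values bob and abc@sample.com end up in $6 and $8.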

EDIT:

You can remove anything before the equal signs using sub(/[^=]*=/, "", $i). Because the pattern is anchored by [^=]*, only the text up to and including the first "=" in each field is removed. This will allow you to print just the 'values':

awk 'BEGIN { FS="[ \t;]+"; OFS=" " } { for (i=1; i<=NF; i++) { sub (/[^=]*=/,"", $i); line = (line ? line OFS : "") $i } print line; line = "" }' file.txt

Results:

20/01/2012 01:14:27 UP UserID bob abc@sample.com
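
If the labels are needed alongside the values, one possible variation is to leave the fields untouched and cut each label=value field at its first "=" with index and substr. This is only a sketch under the assumption that fields are separated by spaces, tabs, or semicolons as in the sample line; the function name extract_pairs and the " -> " output format are made up for illustration.

```shell
# Hypothetical helper: for every field that contains "=", print the
# label and the value separately. Fields without "=" are skipped.
extract_pairs() {
    awk -F '[ \t;]+' '{
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")      # position of the first equals sign, 0 if none
            if (eq > 0)
                print substr($i, 1, eq - 1) " -> " substr($i, eq + 1)
        }
    }' "$@"
}

# Sample line taken from the question.
printf '%s\n' '20/01/2012 01:14:27;UP;UserID;User=bob email=abc@sample.com' | extract_pairs
```

Since the position of "=" is looked up for each field, this does not depend on the length of the label or the value.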
