
awk search on multiple fields of a multi-line record file

I have a file with records of the form:

SMS-MT-FSM-DEL-REP
country: IN
1280363645.979354_PFS_1_1887728354

SMS-MT-FSM-DEL-REP
country: IN
1280363645.729309_PFS_1_1084296392

SMS-MO-FSM
country: IR
1280105721.484103_PFM_1_1187616097

SMS-MO-FSM
country: MO
1280105721.461090_PFM_1_882824215

This lends itself to parsing via awk using something like:

awk 'BEGIN { FS="\n"; RS="" } /country:.*MO/ {print $0}'

My question is: how do I use awk to search the records on 2 separate fields? For example, I only want to print records that have a country of MO and whose first line is SMS-MO-FSM.

If you have set FS="\n" and RS="", then each line of a record is one field, so the first field $1 would be SMS-MO-FSM and $2 the country line. Therefore your awk code is:

awk 'BEGIN{FS="\n"; RS=""} $2~/country.*MO/ && $1~/SMS-MO-FSM/ ' file
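To try the command above without a real log file, here is a minimal sketch that pipes the four sample records from the question through the same program. The `sample` variable and the `match_mo` function are names made up for this illustration; they are not part of the original answer.

```shell
#!/bin/sh
# Sample data copied from the question; blank lines separate the records.
sample='SMS-MT-FSM-DEL-REP
country: IN
1280363645.979354_PFS_1_1887728354

SMS-MT-FSM-DEL-REP
country: IN
1280363645.729309_PFS_1_1084296392

SMS-MO-FSM
country: IR
1280105721.484103_PFM_1_1187616097

SMS-MO-FSM
country: MO
1280105721.461090_PFM_1_882824215'

# match_mo (hypothetical helper): reads records on stdin and prints those
# whose first line matches SMS-MO-FSM and whose second line matches
# country.*MO. RS="" selects paragraph mode; FS="\n" makes each line a field.
match_mo() {
    awk 'BEGIN { FS = "\n"; RS = "" } $2 ~ /country.*MO/ && $1 ~ /SMS-MO-FSM/'
}

printf '%s\n' "$sample" | match_mo
```

Only the last record should be printed, still spread over its three original lines, because a pattern without an action prints $0 unchanged.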

(I am posting this as a separate answer instead of a comment reply for better formatting.)

Concerning your second remark about printing a record on a single line: when you don't modify your records, OFS has no effect, because awk prints $0 exactly as it was read. Only when you assign to one of the fields does awk rebuild $0 as $1 OFS $2 OFS ... $NF (assigning to $0 instead re-splits the fields and recomputes NF); print then appends ORS. You can force this reconstruction like this:

BEGIN {
    FS  = "\n"
    RS  = ""
    OFS = ";"     # Or another delimiter that does not appear in your data
    ORS = "\n"
}
$2 ~ /^[ \t]*country:[ \t]*MO[ \t]*$/ && $1 ~ /^[ \t]*SMS-MO-FSM[ \t]*$/ {
    $1 = $1 ""    # This forces the reconstruction
    print
}
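As a rough illustration of the effect of that field assignment, the sketch below runs one record from the question through awk twice, once with and once without `$1 = $1`. The `record` variable and the `join_record` and `print_unchanged` helpers are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# One record from the question, used as illustrative input.
record='SMS-MO-FSM
country: MO
1280105721.461090_PFM_1_882824215'

# join_record (hypothetical helper): assigning to $1 makes awk rebuild $0
# with OFS between the fields, so each record comes out on a single line.
join_record() {
    awk 'BEGIN { FS = "\n"; RS = ""; OFS = ";" } { $1 = $1; print }'
}

# print_unchanged (hypothetical helper): same settings, but no field is
# assigned, so $0 is printed as read and OFS is never used.
print_unchanged() {
    awk 'BEGIN { FS = "\n"; RS = ""; OFS = ";" } { print }'
}

printf '%s\n' "$record" | join_record
printf '%s\n' "$record" | print_unchanged
```

The first call should print the record as `SMS-MO-FSM;country: MO;1280105721.461090_PFM_1_882824215`, while the second leaves it on three lines.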
