Extract lines from a file with bash or Python

Here is my file content, which is the output of pflogsumm:

Host/Domain Summary: Messages Received 
---------------------------------------
 msg cnt   bytes   host/domain
 -------- -------  -----------
    415     5416k  abc.com
     13    19072   xyz.localdomain

Senders by message count
------------------------
    415   alert@example.com
     13   root@jelly.localdomain

Recipients by message count
---------------------------
    506   alert@apple.com            <= Extract from here to ...
     70   info@pafpro.org.us
     ..
     ...
     19   gems@gmail.com
     17   info@aol.com
     13   hemdem@gmail.com           <= Extract ends here

Senders by message size
-----------------------
   5416k  alert@google.com
...
 ...

The output seems to have the information fields separated by a title and a blank line, for example: Recipients by message count ... <contents of interest> ... blank line. I tried the sed expression below, but it returns all lines after the one matching the string "Recipients by message count":

sed -nr '/.*Recipients by message count/,/\\n/ p'

Desired output: all email addresses listed under "Recipients by message count".

Using awk:

awk '/Recipients by message count/{p=1}!$0{p=0}p' input_file

This prints the "Recipients by message count" block, starting at the title line and stopping at the next empty line.

Breakdown:

/Recipients by message count/ {p=1} # When /pattern/ is matched set p = 1
!$0 {p=0}                           # When input line is empty set p = 0
p                                   # Print line if p is true, short for:
                                    # p { print $0 }
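
For readers more comfortable with Python, the same flag idea can be sketched roughly as follows. This is only an illustration: the function name extract_block and the SAMPLE text are made up for the example, and the emptiness test is a loose stand-in for awk's !$0 (awk would also treat a line containing just 0 as false).

```python
def extract_block(lines, title="Recipients by message count"):
    """Yield lines from the title line up to (not including) the next empty line."""
    printing = False
    for line in lines:
        line = line.rstrip("\n")
        if title in line:   # /Recipients by message count/ {p=1}
            printing = True
        if not line:        # !$0 {p=0}
            printing = False
        if printing:        # p
            yield line


# Hypothetical, shortened report used only for this demonstration.
SAMPLE = """\
Senders by message count
------------------------
    415   alert@example.com

Recipients by message count
---------------------------
    506   alert@apple.com
     13   hemdem@gmail.com

Senders by message size
-----------------------
   5416k  alert@google.com
"""

if __name__ == "__main__":
    for row in extract_block(SAMPLE.splitlines()):
        print(row)
```

Like the awk one-liner, this keeps the title and its dashed underline in the output; they would have to be skipped separately if only the rows are wanted.
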

Using sed, with a second sed pass that drops the title, the underline, and the closing blank line:

$ sed -n '/Recipients by message count/,/^\s*$/ p' data | sed -n '1!{2!{$!p}}'
    506   alert@apple.com            <= Extract from here to ...
     70   info@pafpro.org.us
     ..
     ...
     19   gems@gmail.com
     17   info@aol.com
     13   hemdem@gmail.com           <= Extract ends here
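
The second sed in that pipeline trims purely by position: it drops line 1 (the title), line 2 (the underline), and the last line (normally the blank one). A small Python sketch of the same two steps makes one consequence visible. The names block_body, WITH_BLANK, and NO_BLANK are invented for this example, and the sketch assumes the input is already split into lines without newline characters.

```python
def block_body(lines, title="Recipients by message count"):
    """Collect title..first blank line (inclusive), then trim by position."""
    block = []
    inside = False
    for line in lines:
        if not inside and title in line:
            inside = True            # range start: the title line
        if inside:
            block.append(line)
            if not line.strip():     # range end: /^\s*$/
                break
    # Same effect as sed -n '1!{2!{$!p}}': skip the first two lines and the last.
    return block[2:-1]


# Hypothetical inputs: one block closed by a blank line, one cut off at end of file.
WITH_BLANK = [
    "Recipients by message count",
    "---------------------------",
    "    506   alert@apple.com",
    "     13   hemdem@gmail.com",
    "",
    "Senders by message size",
]
NO_BLANK = WITH_BLANK[:4]

if __name__ == "__main__":
    print(block_body(WITH_BLANK))   # both rows
    print(block_body(NO_BLANK))     # the last row is lost
```

If the block runs to the end of the file with no blank line after it, the positional trim removes the last data row instead of a blank line, so this approach relies on the blank separator being present.
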

Something like this:

    findthis = "Recipients by message count"

    with open("tst.dat") as f:
        for line in f:
            if findthis not in line:
                continue
            next(f, None)               # skip the dashed underline below the title

            for line in f:
                line = line.rstrip()    # strip the newline and trailing whitespace
                if line == "":          # an empty line ends the block
                    break
                print(line)

If the file is big or you need wildcard searches, use the regular expression library.
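
A sketch of what a regular-expression variant might look like, using Python's standard re module. The helper name block_after, its title_pattern parameter, and the SAMPLE text are assumptions made for this example; the pattern also assumes every title is followed by a dashed underline, as in the report above.

```python
import re


def block_after(text, title_pattern):
    """Return the non-blank lines that follow a title and its dashed underline."""
    pattern = re.compile(
        rf"^{title_pattern}[ \t]*\n"   # title line (may contain regex wildcards)
        r"-+[ \t]*\n"                  # dashed underline
        r"((?:.*\S.*\n?)*)",           # consecutive lines holding a non-space char
        re.MULTILINE,
    )
    match = pattern.search(text)
    return match.group(1).splitlines() if match else []


# Hypothetical, shortened report used only for this demonstration.
SAMPLE = """\
Senders by message count
------------------------
    415   alert@example.com

Recipients by message count
---------------------------
    506   alert@apple.com
     13   hemdem@gmail.com

Senders by message size
-----------------------
   5416k  alert@google.com
"""

if __name__ == "__main__":
    for row in block_after(SAMPLE, r"Recipients by message count"):
        print(row)
    # A wildcard in the title, here any single word before "size".
    print(block_after(SAMPLE, r"Senders by \w+ size"))
```

Because the capture group only accepts lines containing a non-space character, the match stops at the first blank line or at the end of the text, so no trailing blank line is needed.
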

The script below:

sed -n '/Recipients/{n;n;:loop;/^$/!{p;n;b loop};q}' filename

will do the job for you.

Note: if the block of interest is at the very end of the file, it needs a trailing blank line.

An awk command: for the lines between "Recipients" and "Senders", print each line that starts with a space.

[name@server ~]$ awk '/^Recipients/,/^Senders/ { if ($0~/^ /) print }' input.txt
    506   alert@apple.com            <= Extract from here to ...
     70   info@pafpro.org.us
     ..
     ...
     19   gems@gmail.com
     17   info@aol.com
     13   hemdem@gmail.com           <= Extract ends here
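
Since the desired output is the addresses themselves, the same range-plus-filter idea can be taken one step further in Python by splitting each row into its columns. This is a sketch under assumptions: recipient_counts and SAMPLE are invented names, the count column is assumed to be a plain integer, and unlike the awk range it stops at the first "Senders" line instead of looking for another "Recipients" title.

```python
def recipient_counts(lines):
    """Return (count, address) pairs for rows between the two titles."""
    rows = []
    in_range = False
    for line in lines:
        if not in_range and line.startswith("Recipients"):
            in_range = True                      # range start: /^Recipients/
        elif in_range and line.startswith("Senders"):
            break                                # range end: /^Senders/
        if in_range and line.startswith(" "):    # same filter as $0 ~ /^ /
            fields = line.split()
            # Ignore placeholder rows such as ".." that carry no numeric count.
            if len(fields) >= 2 and fields[0].isdigit():
                rows.append((int(fields[0]), fields[1]))
    return rows


# Hypothetical, shortened report used only for this demonstration.
SAMPLE = """\
Recipients by message count
---------------------------
    506   alert@apple.com            <= Extract from here to ...
     ..
     13   hemdem@gmail.com           <= Extract ends here

Senders by message size
-----------------------
   5416k  alert@google.com
"""

if __name__ == "__main__":
    for count, address in recipient_counts(SAMPLE.splitlines()):
        print(address)
```

Keeping the count next to each address makes it easy to sort or filter the recipients later; printing only the second element gives the bare list of addresses.
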

Another sed one-liner:

 sed '/Recipients by message count/,/^$/!d;//{N;d};' file
