Extract lines from a file using bash or Python

Question

Here is my file content, which is the output of pflogsumm:

Host/Domain Summary: Messages Received 
---------------------------------------
 msg cnt   bytes   host/domain
 -------- -------  -----------
    415     5416k  abc.com
     13    19072   xyz.localdomain

Senders by message count
------------------------
    415   alert@example.com
     13   root@jelly.localdomain

Recipients by message count
---------------------------
    506   alert@apple.com            <= Extract from here to ...
     70   info@pafpro.org.us
     ..
     ...
     19   gems@gmail.com
     17   info@aol.com
     13   hemdem@gmail.com           <= Extract ends here

Senders by message size
-----------------------
   5416k  alert@google.com
...
 ...

The output seems to have its information fields separated by a title and a blank line, for example: Recipients by message count ...<contents of interest> ... NewLine. I tried the sed expression below, but it returns all lines after the match for the string "Recipients by message count":

sed -nr '/.*Recipients by message count/,/\\n/ p'

Desired output: all email addresses under "Recipients by message count".

Answer 1

Using awk:

awk '/Recipients by message count/{p=1}!$0{p=0}p' input_file

This prints the "Recipients by message count" block, including the heading and its dashed underline, up to the next empty line.

Breakdown:

/Recipients by message count/ {p=1} # When /pattern/ is matched set p = 1
!$0 {p=0}                           # When input line is empty set p = 0
p                                   # Print line if p is true, short for:
                                    # p { print $0 }
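For comparison, the same flag-based idea can be sketched in Python. This is only an illustration of the breakdown above, not part of the awk answer; the function name `extract_block` and its `heading` parameter are made up for the example, and blank detection is slightly more lenient than awk's `!$0` because whitespace-only lines also end the block.

```python
def extract_block(lines, heading="Recipients by message count"):
    """Yield lines from the heading line up to (not including) the next blank line."""
    printing = False
    for line in lines:
        if heading in line:
            printing = True          # mirrors: /pattern/ {p=1}
        if not line.strip():
            printing = False         # mirrors: !$0 {p=0} (also catches whitespace-only lines)
        if printing:
            yield line.rstrip("\n")  # mirrors: p


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python extract.py < input_file
    for row in extract_block(sys.stdin):
        print(row)
```

As with the awk one-liner, the heading and the dashed line are part of the output; drop the first two yielded items if only the address rows are wanted.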

Answer 2

$ sed -n '/Recipients by message count/,/^\s*$/ p' data | sed -n '1!{2!{$!p}}'
    506   alert@apple.com            <= Extract from here to ...
     70   info@pafpro.org.us
     ..
     ...
     19   gems@gmail.com
     17   info@aol.com
     13   hemdem@gmail.com           <= Extract ends here

Answer 3

Something like this:

    findthis = "Recipients by message count"

    with open("tst.dat") as f:
        for line in f:
            if findthis not in line:
                continue                # keep scanning until the heading appears
            next(f, None)               # skip the dashed underline after the heading

            for line in f:
                line = line.rstrip()    # strip the newline and trailing whitespace
                if not line:            # a blank line ends the block
                    break
                print(line)

If the file is big or you have wildcard searches, use the regular expression library.
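A minimal sketch of the regular-expression route, using Python's standard `re` module. The pattern and the function name `recipients_block` are assumptions made for illustration. Note that this version works on the whole text at once, so it fits the wildcard-search case better than the large-file case.

```python
import re

# Heading line, dashed underline, then a run of non-blank lines.
BLOCK_RE = re.compile(
    r"^Recipients by message count[ \t]*\n"   # heading (trailing blanks allowed)
    r"-+[ \t]*\n"                             # dashed underline
    r"((?:[ \t]*\S.*(?:\n|\Z))+)",            # one or more lines containing text
    re.MULTILINE,
)


def recipients_block(text):
    """Return the rows under the heading, or an empty list if it is absent."""
    match = BLOCK_RE.search(text)
    return match.group(1).splitlines() if match else []


if __name__ == "__main__":
    # "tst.dat" is the file name used in the answer above.
    with open("tst.dat") as f:
        for row in recipients_block(f.read()):
            print(row)
```

To match a family of headings, the literal heading in the pattern can be replaced with a wildcard such as `Recipients by .*`.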

Answer 4

The script below:

sed -n '/Recipients/{n;n;:loop;/^$/!{p;n;b loop};q}' filename

will do the job for you.

Note: If the pattern of interest is at the very end, you require a trailing blank line.
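If depending on a trailing blank line is a concern, the end-of-input case can be handled explicitly. The Python sketch below follows the same steps as the sed script (find the marker, skip two lines, take rows until a blank line) but also stops cleanly when the input simply ends. The name `block_after` and its parameters are hypothetical.

```python
from itertools import dropwhile, islice, takewhile


def block_after(lines, marker="Recipients"):
    """Rows after the marker line and its underline, up to a blank line or end of input."""
    # Discard everything before the first line containing the marker.
    rest = dropwhile(lambda line: marker not in line, lines)
    # Skip the marker line and the dashed underline (the two n commands in sed).
    body = islice(rest, 2, None)
    # takewhile stops at the first blank line, or when the input runs out.
    return [line.rstrip("\n") for line in takewhile(lambda line: line.strip(), body)]


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python block_after.py < filename
    print("\n".join(block_after(sys.stdin)))
```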

Answer 5

An awk command: for the lines between "Recipients" and "Senders", print the line if it starts with a space.

[name@server ~]$ awk '/^Recipients/,/^Senders/ { if ($0~/^ /) print }' input.txt
    506   alert@apple.com            <= Extract from here to ...
     70   info@pafpro.org.us
     ..
     ...
     19   gems@gmail.com
     17   info@aol.com
     13   hemdem@gmail.com           <= Extract ends here
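The range-plus-filter idea can also be written out in Python, which makes the two conditions explicit. This is a rough equivalent for illustration, assuming section headings start in column one and data rows are indented, as in the sample; `lines_between` and its parameters are invented names.

```python
def lines_between(lines, start="Recipients", end="Senders"):
    """Collect indented lines from a line starting with `start` through one starting with `end`."""
    in_range = False
    rows = []
    for line in lines:
        if not in_range and line.startswith(start):
            in_range = True                   # range opens on the start heading
        if in_range:
            if line.startswith(" "):          # keep only indented data rows
                rows.append(line.rstrip("\n"))
            if line.startswith(end):          # range closes on the end heading
                in_range = False
    return rows


if __name__ == "__main__":
    # "input.txt" is the file name used in the awk command above.
    with open("input.txt") as f:
        print("\n".join(lines_between(f)))
```

Because the headings, the dashed underline, and the blank separator do not start with a space, the filter drops them without any extra line skipping.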

Answer 6

Another sed one-liner:

 sed '/Recipients by message count/,/^$/!d;//{N;d};' file

6 answers

Answer 1: score 4, accepted, 2016-05-12 06:32:29
Answer 2: score 2, 2016-05-12 06:33:19
Answer 3: score 1, 2016-05-12 06:40:06
Answer 4: score 1, 2016-05-12 07:23:08
Answer 5: score 0, 2016-05-12 06:56:14
Answer 6: score 0, 2016-05-12 08:41:02
