
How to gzip the resulting output of an awk command?

I tried this, but it did not work. How can I get an awk command to write its output as a gz file?

 awk -v someVariable="$VAR1" '
        BEGIN {
         . . .
        }

        {
           SOME CODES HERE
        }
        END {}
    ' $FILES> gzip $RESULTING_OUTPUT

You can either pipe your output to gzip outside of awk (as Tom suggested), or you can send output through pipes from inside awk, like this:

awk '
  {
    print | "gzip > /path/to/output.gz";
  }
' inputfile

This has a tremendous advantage in certain situations. For example, if you want to split a really long log file into hourly chunks, you can build your output command as a variable, then pipe to the variable.

awk '

  # given:
  #   Mar 20 13:29:12 servername some message

  BEGIN {
    m["jan"]="01"; m["feb"]="02"; m["mar"]="03"; m["apr"]="04"; 
    m["may"]="05"; m["jun"]="06"; m["jul"]="07"; m["aug"]="08"; 
    m["sep"]="09"; m["oct"]="10"; m["nov"]="11"; m["dec"]="12";
  }

  {
    output=sprintf("gzip -9 > /var/log/split/%s-%s-%s.log.gz", m[tolower($1)], $2, substr($3,1,2));
    print | output
  }' input.log

With this usage, the gzip command line is re-evaluated for every line of input, and awk doesn't close a pipe unless it's told to explicitly (with close()), or until awk runs out of input and exits. Each distinct command string therefore keeps its own gzip process open until then.
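If the number of distinct hours is large, the open pipes can add up. The following is a minimal sketch of one way to keep only one pipe open at a time, not part of the original answer. It assumes a temporary directory and three made-up log lines (all hypothetical), and it swaps `>` for `>>` so that reopening an hour appends a new gzip member instead of truncating the file; gzip can decompress concatenated members as one stream.

```shell
#!/bin/sh
# Sketch only: the working directory and the sample log lines are hypothetical.
workdir=$(mktemp -d)

# Sample input; the 13:00 hour appears twice, separated by a 14:00 line.
printf '%s\n' \
  'Mar 20 13:29:12 servername first message' \
  'Mar 20 14:02:45 servername second message' \
  'Mar 20 13:59:59 servername third message' > "$workdir/input.log"

awk -v dir="$workdir" '
  BEGIN {
    m["jan"]="01"; m["feb"]="02"; m["mar"]="03"; m["apr"]="04";
    m["may"]="05"; m["jun"]="06"; m["jul"]="07"; m["aug"]="08";
    m["sep"]="09"; m["oct"]="10"; m["nov"]="11"; m["dec"]="12";
  }

  {
    # ">>" appends, so reopening the same hour later adds a new gzip member.
    output = sprintf("gzip -9 >> \"%s/%s-%s-%s.log.gz\"", dir, m[tolower($1)], $2, substr($3,1,2))

    # Close the previous pipe whenever the target command changes.
    if (prev != "" && prev != output) close(prev)

    print | output
    prev = output
  }
' "$workdir/input.log"

# Show what ended up in the 13:00 chunk.
gzip -dc "$workdir/03-20-13.log.gz"
```

The trade-off is that every close and reopen starts a new gzip process, so on badly shuffled input this is slower and compresses somewhat worse than leaving the pipes open.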

My own use case for this was that we were gathering web server logs from a CDN that were not in chronological order. The logs were way too large for sort, but could be handled when split into hourly chunks.

YMMV. The best solution depends on what you're actually trying to achieve, which you haven't told us.

You need to pipe the output to gzip, then redirect the output to a file:

awk '...' $FILES | gzip > "$RESULTING_OUTPUT"

Note that capital letters for variable names are not recommended, as they may clash with shell internal variables. Also, $FILES looks suspiciously like it may contain a list of more than one file name. You should really be using an array, which you can pass like "${files[@]}".
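As a rough illustration of the array advice, here is a small bash sketch (arrays are not available in plain sh). The file names, their contents, and the one-line awk program are invented for the example; only the `"${files[@]}"` expansion and the pipe to gzip come from the answer above.

```shell
#!/bin/bash
# Sketch only: file names, contents, and the awk body are made up for illustration.
workdir=$(mktemp -d)
printf 'a 1\nb 2\n' > "$workdir/first file.txt"
printf 'c 3\n' > "$workdir/second.txt"

# An array keeps each file name as a single word, even when it contains spaces.
files=("$workdir/first file.txt" "$workdir/second.txt")
resulting_output="$workdir/result.gz"

# awk reads every file in the array; gzip compresses whatever awk prints.
awk -v someVariable="demo" '{ print someVariable, $1 }' "${files[@]}" | gzip > "$resulting_output"

# Decompress to confirm the contents.
gzip -dc "$resulting_output"
```

With an unquoted string variable like $FILES, the name containing a space would be split into two arguments and awk would fail to open either of them.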

This worked:

 awk -v someVariable="$VAR1" '
            BEGIN {
             . . .
            }

            {
               SOME CODES HERE
            }
            END {}
        ' $FILES> $RESULTING_OUTPUT
    gzip $RESULTING_OUTPUT

