AWK - Parsing SQL output
I have SQL output like the one below from a custom tool. I would appreciate any help in finding out what I am doing incorrectly.
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------
cec75 | 1234 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 06:40:10.231635+00
(1 row)
I am trying to pipe this output and extract only the columns I need, in my case column1, column2, and column7. I have tried piping it like this, but it just prints column1:
tool check | awk '{print $1, $2}'
column1 |
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+-------------------------------
cec75 |
(1 row)
It would be nice to have something like this:
ce7c5,1234,SOME_TEXT
My file contents:
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+-------------------------------
6601c | 2396 | 123 | | 9350 | | SOME_TEXT | | | 2020-12-07 22:49:01.023+00 | 2020-12-08 07:22:37.419669+00
(1 row)
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+-------------------------------
cec75 | 1567 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 07:28:10.319888+00
(1 row)
You need to set the correct FS and somehow filter out the undesired (junk) lines. I would do it the following way. Let the content of file.txt be:
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------
cec75 | 1234 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 06:40:10.231635+00
(1 row)
then
awk 'BEGIN{FS="[[:space:]]+\\|[[:space:]]+";OFS=","}(NR>=2 && NF>=2){print $1,$2,$7}' file.txt
output:
cec75,1234,2020-12-07 20:28:46.865+00
Explanation: I set the field separator (FS) to: one or more :space:, a literal |, one or more :space:, where :space: means any whitespace character. Depending on your data you might elect to use zero or more rather than one or more; to do so, replace + with *. For every line which is not the first one (this filters out the header) and which has at least 2 fields (this filters out the line made of - and + characters and the (1 row) line), I print the content of the 1st column, followed by a comma, followed by the content of the 2nd column, followed by a comma, followed by the content of the 7th column.
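To make the +-versus-* point above concrete: with * instead of +, runs like | | no longer collapse into a single separator, so the empty fields survive and $7 lines up with column7 (SOME_TEXT), which is the output the question asked for. A self-contained sketch, with the sample row inlined via a here-document in place of file.txt:

```shell
# Same program, but with * instead of + in FS: empty fields are
# preserved, so $7 really is column7 (SOME_TEXT) instead of a timestamp.
awk 'BEGIN{FS="[[:space:]]*\\|[[:space:]]*";OFS=","}(NR>=2 && NF>=2){print $1,$2,$7}' <<'EOF'
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------
cec75 | 1234 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 06:40:10.231635+00
(1 row)
EOF
# prints: cec75,1234,SOME_TEXT
```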
EDIT: Since the OP added an edited set of samples, adding this solution now. This assumes that you want to print the lines that come right after lines starting with ---.
awk -F'[[:space:]]*\\|[[:space:]]*' '/^---/{found=1;next} found{print $1,$2,$7;found=""}' Input_file
OR
your_command |
awk -F'[[:space:]]*\\|[[:space:]]*' '/^---/{found=1;next} found{print $1,$2,$7;found=""}'
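Run against the two-result-set file from the question (inlined here as a here-document standing in for Input_file; note that this variant sets no OFS, so the fields come out separated by the default single space):

```shell
# A /^---/ line arms the found flag; the very next line is printed
# as data and the flag is cleared again.
awk -F'[[:space:]]*\\|[[:space:]]*' '/^---/{found=1;next} found{print $1,$2,$7;found=""}' <<'EOF'
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------
6601c | 2396 | 123 | | 9350 | | SOME_TEXT | | | 2020-12-07 22:49:01.023+00 | 2020-12-08 07:22:37.419669+00
(1 row)
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------
cec75 | 1567 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 07:28:10.319888+00
(1 row)
EOF
# prints:
# 6601c 2396 SOME_TEXT
# cec75 1567 SOME_TEXT
```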
Description:
Command line switches... -F' *\| *' sets the field separator to a | surrounded by (optional) spaces; the \'s are there to escape the | if we feed the regex for the delimiter in from the command line. -v OFS=, sets the output field separator to a comma.
The awk script... Lines containing column are headers, and if a ( is seen on a line, it's not a valid line; so, just ignore both. Any remaining line containing an alphanumeric character is a data row: strip its leading spaces and print fields 1, 2 and 7.
tool check | awk -F' *\\| *' -v OFS=, '/column|\(/ { next } /[[:alnum:]]/ { sub(/^ +/, ""); print $1, $2, $7 }'
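A quick way to sanity-check this filter is to run it over the question's sample file, with a here-document standing in for the tool check pipeline (a sketch, not the real tool):

```shell
# Headers contain "column" and row counts contain "(", so both are
# skipped; the dashes line has no alphanumerics, so it is skipped too.
awk -F' *\\| *' -v OFS=, '/column|\(/ { next } /[[:alnum:]]/ { sub(/^ +/, ""); print $1, $2, $7 }' <<'EOF'
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------
6601c | 2396 | 123 | | 9350 | | SOME_TEXT | | | 2020-12-07 22:49:01.023+00 | 2020-12-08 07:22:37.419669+00
(1 row)
cec75 | 1567 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 07:28:10.319888+00
(1 row)
EOF
# prints:
# 6601c,2396,SOME_TEXT
# cec75,1567,SOME_TEXT
```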
Examining the data more closely... It looks as though the date-stamp (which always has a : in it) might be present on all valid records... If so, the script can be reduced to something much simpler:
tool check | awk -F' *\\| *' -v OFS=, '$10 ~ /:/ { sub(/^ +/, ""); print $1, $2, $7 }'
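For example, against the sample data from the question (here-document standing in for tool check; only genuine data rows carry a timestamp, and hence a :, in field 10):

```shell
# Headers have $10 == "column10" (no colon) and the dashes / "(1 row)"
# lines have fewer than 10 fields, so only real data rows print.
awk -F' *\\| *' -v OFS=, '$10 ~ /:/ { sub(/^ +/, ""); print $1, $2, $7 }' <<'EOF'
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------
6601c | 2396 | 123 | | 9350 | | SOME_TEXT | | | 2020-12-07 22:49:01.023+00 | 2020-12-08 07:22:37.419669+00
(1 row)
cec75 | 1567 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 07:28:10.319888+00
(1 row)
EOF
# prints:
# 6601c,2396,SOME_TEXT
# cec75,1567,SOME_TEXT
```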