Printing specific columns with AWK using delimiters

Question

My file looks like this:

+------------------------------------------+---------------+----------------+------------------+------------------+-----------------+
| Message                                  | Status        | Adress         | Changes          | Test             | Calibration     |
|------------------------------------------+---------------+----------------+------------------+------------------+-----------------|
| Hello World                              | Active        | up             |                1 |               up |            done |
| Hello Everyone Here                      | Passive       | up             |                2 |             down |            none |
| Hi there. My name is Eric. How are you?  | Down          | up             |                3 |         inactive |            done |
+------------------------------------------+---------------+----------------+------------------+------------------+-----------------+
+----------------------------+---------------+----------------+------------------+------------------+-----------------+
| Message                    | Status        | Adress         | Changes          | Test             | Calibration     |
|----------------------------+---------------+----------------+------------------+------------------+-----------------|
| What's up?                 | Active        | up             |                1 |               up |            done |
| Hi. I'm Otilia             | Passive       | up             |                2 |             down |            none |
| Hi there. This is Marcus   | Up            | up             |                3 |         inactive |            done |
+----------------------------+---------------+----------------+------------------+------------------+-----------------+

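I want to extract specific columns using AWK. I can do it with cut; however, because the width of each table varies with the number of characters in each column, I don't get the desired output.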

cat File.txt | cut -c -44
+------------------------------------------+
| Message                                  |
|------------------------------------------+
| Hello World                              |
| Hello Everyone Here                      |
| Hi there. My name is Eric. How are you?  |
+------------------------------------------+
+----------------------------+--------------
| Message                    | Status
|----------------------------+--------------
| What's up?                 | Active
| Hi. I'm Otilia             | Passive
| Hi there. This is Marcus   | Up
+----------------------------+--------------

or

cat File.txt | cut -c 44-60
+---------------+
| Status        |
+---------------+
| Active        |
| Passive       |
| Down          |
+---------------+
--+--------------
  | Adress
--+--------------
  | up
  | up
  | up
--+--------------

I tried using AWK, but I don't know how to specify 2 different delimiters so that all lines are handled.

cat File.txt | awk 'BEGIN {FS="|";}{print $2,$3}'

 Message                                    Status
------------------------------------------+---------------+----------------+------------------+------------------+-----------------
 Hello World                                Active
 Hello Everyone Here                        Passive
 Hi there. My name is Eric. How are you?    Down


 Message                      Status
----------------------------+---------------+----------------+------------------+------------------+-----------------
 What's up?                   Active
 Hi. I'm Otilia               Passive
 Hi there. This is Marcus     Up

The output I am looking for:

+------------------------------------------+
| Message                                  |
|------------------------------------------+
| Hello World                              |
| Hello Everyone Here                      |
| Hi there. My name is Eric. How are you?  |
+------------------------------------------+
+----------------------------+
| Message                    |
|----------------------------+
| What's up?                 | 
| Hi. I'm Otilia             | 
| Hi there. This is Marcus   | 
+----------------------------+

or

+------------------------------------------+---------------+
| Message                                  | Status        |
|------------------------------------------+---------------+
| Hello World                              | Active        |
| Hello Everyone Here                      | Passive       |
| Hi there. My name is Eric. How are you?  | Down          |
+------------------------------------------+---------------+
+----------------------------+---------------+
| Message                    | Status        | 
|----------------------------+---------------+
| What's up?                 | Active        | 
| Hi. I'm Otilia             | Passive       | 
| Hi there. This is Marcus   | Up            | 
+----------------------------+---------------+

or any other arbitrary columns

+------------------------------------------+----------------+------------------+
| Message                                  | Adress         | Test             |
|------------------------------------------+----------------+------------------+
| Hello World                              | up             |               up |
| Hello Everyone Here                      | up             |             down |
| Hi there. My name is Eric. How are you?  | up             |         inactive |
+------------------------------------------+----------------+------------------+
+----------------------------+---------------+------------------+
| Message                    |Adress         | Test             |
|----------------------------+---------------+------------------+
| What's up?                 |up             |               up |
| Hi. I'm Otilia             |up             |             down |
| Hi there. This is Marcus   |up             |         inactive |
+----------------------------+---------------+------------------+

Thanks in advance.

Answer 1

One idea using GNU awk:

awk -v fldlist="2,3" '
BEGIN { fldcnt=split(fldlist,fields,",") }                      # split fldlist into array fields[]

      { split($0,arr,/[|+]/,seps)                               # split current line on dual delimiters "|" and "+"
        for (i=1;i<=fldcnt;i++)                                 # loop through our array of fields (fldlist)
            printf "%s%s", seps[fields[i]-1], arr[fields[i]]    # print leading separator/delimiter and field
        printf "%s\n", seps[fields[fldcnt]]                     # print trailing separator/delimiter and terminate line
      }
' File.txt

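Notes:

the 4th argument to the split() function requires GNU awk (seps == array of separators; see the gawk string functions documentation for details)
this assumes our field delimiters (|, +) do not appear as part of the data
the input variable fldlist is a comma-delimited list of columns that mimics what would be passed to cut (e.g., when a line starts with a delimiter, field #1 is blank)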
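
To make the role of the seps array more concrete, here is a minimal sketch that does the same field/separator bookkeeping by hand, without the GNU-only 4th argument to split(). It is not part of the original answer: the function name select_cols and the arrays fld[] and sep[] are hypothetical names chosen for this illustration, and it makes the same assumption that | and + never occur inside the data.

```shell
#!/bin/sh
# select_cols FLDLIST [FILE...]
# Hypothetical helper: prints the fields named in FLDLIST (e.g. "2,3")
# together with the delimiters around them, using POSIX awk only.
select_cols() {
    fldlist=$1; shift
    awk -v fldlist="$fldlist" '
    BEGIN { fldcnt = split(fldlist, fields, ",") }
    {
        split("", fld); split("", sep)      # portable way to empty both arrays
        rest = $0; n = 0
        # Walk the line, storing each field in fld[n] and the delimiter
        # that follows it in sep[n] (the information gawk puts in seps[]).
        while (match(rest, /[|+]/)) {
            n++
            fld[n] = substr(rest, 1, RSTART - 1)
            sep[n] = substr(rest, RSTART, 1)
            rest   = substr(rest, RSTART + 1)
        }
        fld[n + 1] = rest                   # text after the last delimiter

        out = ""
        for (i = 1; i <= fldcnt; i++)       # leading delimiter, then field
            out = out sep[fields[i] - 1] fld[fields[i]]
        print out sep[fields[fldcnt]]       # trailing delimiter
    }' "$@"
}
```

Called as select_cols 2,3 File.txt, this should select the same columns as the GNU awk script with fldlist="2,3"; with no file argument it reads standard input.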

For fldlist="2,3" the GNU awk script generates:

+------------------------------------------+---------------+
| Message                                  | Status        |
|------------------------------------------+---------------+
| Hello World                              | Active        |
| Hello Everyone Here                      | Passive       |
| Hi there. My name is Eric. How are you?  | Down          |
+------------------------------------------+---------------+
+----------------------------+---------------+
| Message                    | Status        |
|----------------------------+---------------+
| What's up?                 | Active        |
| Hi. I'm Otilia             | Passive       |
| Hi there. This is Marcus   | Up            |
+----------------------------+---------------+

For fldlist="2,4,6" this generates:

+------------------------------------------+----------------+------------------+
| Message                                  | Adress         | Test             |
|------------------------------------------+----------------+------------------+
| Hello World                              | up             |               up |
| Hello Everyone Here                      | up             |             down |
| Hi there. My name is Eric. How are you?  | up             |         inactive |
+------------------------------------------+----------------+------------------+
+----------------------------+----------------+------------------+
| Message                    | Adress         | Test             |
|----------------------------+----------------+------------------+
| What's up?                 | up             |               up |
| Hi. I'm Otilia             | up             |             down |
| Hi there. This is Marcus   | up             |         inactive |
+----------------------------+----------------+------------------+

For fldlist="4,3,2" this generates:

+----------------+---------------+------------------------------------------+
| Adress         | Status        | Message                                  |
+----------------+---------------|------------------------------------------+
| up             | Active        | Hello World                              |
| up             | Passive       | Hello Everyone Here                      |
| up             | Down          | Hi there. My name is Eric. How are you?  |
+----------------+---------------+------------------------------------------+
+----------------+---------------+----------------------------+
| Adress         | Status        | Message                    |
+----------------+---------------|----------------------------+
| up             | Active        | What's up?                 |
| up             | Passive       | Hi. I'm Otilia             |
| up             | Up            | Hi there. This is Marcus   |
+----------------+---------------+----------------------------+

Say that again? (fldlist="3,3,3"):

+---------------+---------------+---------------+
| Status        | Status        | Status        |
+---------------+---------------+---------------+
| Active        | Active        | Active        |
| Passive       | Passive       | Passive       |
| Down          | Down          | Down          |
+---------------+---------------+---------------+
+---------------+---------------+---------------+
| Status        | Status        | Status        |
+---------------+---------------+---------------+
| Active        | Active        | Active        |
| Passive       | Passive       | Passive       |
| Up            | Up            | Up            |
+---------------+---------------+---------------+

If you mistakenly try to print the "first" column, i.e. fldlist="1":

+
|
|
|
|
|
+
+
|
|
|
|
|
+
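
Since field #1 is the empty text before the leading delimiter, one option is to reject such a list before awk ever runs. The following is only a sketch of such a guard, not something the answer itself provides; check_fldlist is a hypothetical name, and the rule "every entry must be an integer >= 2" is an assumption based on the table layout shown in the question.

```shell
#!/bin/sh
# check_fldlist LIST
# Hypothetical guard: succeeds only if LIST is a non-empty, comma-separated
# list of integers that are all >= 2 (field 1 is the blank before the
# leading delimiter, so it never holds table data).
check_fldlist() {
    printf '%s\n' "$1" | awk -F, '
        NF == 0 { bad = 1 }                         # empty list
        {
            for (i = 1; i <= NF; i++)
                if ($i !~ /^[0-9]+$/ || $i + 0 < 2)
                    bad = 1                         # not a number, or field 0/1
        }
        END { exit (bad ? 1 : 0) }'
}
```

A caller could then write something like check_fldlist "$fldlist" || exit 1 ahead of the awk call. Note that this does not check the upper bound; a field number larger than the number of columns would still pass.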

Answer 2

If GNU awk is available, please try markp-fuso's nice solution. If not, here is a POSIX-compliant alternative:

#!/bin/bash

# define bash variables
cols=(2 3 6)                            # bash array of desired columns
col_list=$(IFS=,; echo "${cols[*]}")    # create a csv string

awk -v cols="$col_list" '
NR==FNR {
    if (match($0, /^[|+]/)) {           # the record contains a table
        if (match($0, /^[|+]-/))        # horizontally ruled line
            n = split($0, a, /[|+]/)    # split into columns
        else                            # "cell" line
            n = split($0, a, /\|/)
        len = 0
        for (i = 1; i < n; i++) {
            len += length(a[i]) + 1     # accumulated column position
            pos[FNR, i] = len
        }
    }
    next
}
{
    n = split(cols, a, /,/)             # split the variable `cols` on comma into an array
    for (i = 1; i <= n; i++) {
        col = a[i]
        if (pos[FNR, col] && pos[FNR, col+1]) {
            printf("%s", substr($0, pos[FNR, col], pos[FNR, col + 1] - pos[FNR, col]))
        }
    }
    print(substr($0, pos[FNR, col + 1], 1))
}
' File.txt File.txt

Result with cols=(2 3 6) as shown above:

+---------------+----------------+-----------------+
| Status        | Adress         | Calibration     |
+---------------+----------------+-----------------|
| Active        | up             |            done |
| Passive       | up             |            none |
| Down          | up             |            done |
+---------------+----------------+-----------------+
+---------------+----------------+-----------------+
| Status        | Adress         | Calibration     |
+---------------+----------------+-----------------|
| Active        | up             |            done |
| Passive       | up             |            none |
| Up            | up             |            done |
+---------------+----------------+-----------------+

It detects the column widths in the first pass, then splits each line at the column positions in the second pass.
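
To illustrate what the first pass records, here is a small sketch that runs the same position arithmetic on its input and prints the result instead of storing it in pos[]. The function name delim_positions is hypothetical and exists only for this demonstration.

```shell
#!/bin/sh
# delim_positions
# Hypothetical demo: for each input line, print the 1-based character
# positions of its column delimiters, computed the same way as pass 1.
delim_positions() {
    awk '{
        if ($0 ~ /^[|+]-/)                  # horizontally ruled line
            n = split($0, a, /[|+]/)
        else                                # "cell" line
            n = split($0, a, /\|/)
        len = 0; out = ""
        for (i = 1; i < n; i++) {
            len += length(a[i]) + 1         # width of the piece plus its delimiter
            out = out (i > 1 ? " " : "") len
        }
        print out
    }'
}

printf '%s\n' '| ab | cde | f |' | delim_positions   # prints: 1 6 12 16
```

Each printed number is the position of one delimiter, so column k spans from position k up to (but not including) position k+1, which is the substr() range used in the second pass.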
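You can control which columns are printed with the bash array cols, assigned at the beginning of the script. Please assign the array a list of the desired column numbers in increasing order. If you want to pass the bash variable in a different way, please let me know.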
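
If the column numbers come from somewhere that does not guarantee increasing order, they could be normalized before being handed to awk. This is only a sketch under that assumption: normalize_cols is a hypothetical helper, and dropping duplicates with sort -u is my own choice rather than a requirement stated in the answer.

```shell
#!/bin/sh
# normalize_cols N...
# Hypothetical helper: prints its arguments as an ascending,
# duplicate-free, comma-separated list suitable for -v cols=...
normalize_cols() {
    printf '%s\n' "$@" | sort -n -u | paste -s -d, -
}

col_list=$(normalize_cols 6 2 3)
echo "$col_list"    # prints: 2,3,6
```

The resulting string could replace the col_list assignment in the script, i.e. it would be passed as awk -v cols="$col_list".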

Solution 1
1 2022-04-17 17:28:12

Solution 2
1 2022-04-18 00:07:08