
Transform a pretty-printed table to a single line with separators, using Awk

Trying to clean up output from a Python client. This is an example:

+--------------------------+-----------+
| Text                     | Test      |
+--------------------------+-----------+
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
+--------------------------+-----------+

I started by removing the top and bottom by piping the output with:

Command_Output | tail -n +4 | head -n -1

So now we have the following:

| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |

Now I'm trying to remove the pipes in the table and convert the table to a single comma-separated line. It's important to still keep the correlation between the two numbers, though, so maybe I should use two delimiters. Perhaps the final output should look like the following:

111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789

So now I'm at this point:

Command_Output | tail -n +4 | head -n -1 | awk '{$3 = "~"; print $0;}'

Can someone help me with the last part? I need to get the table into a single, comma-delimited line.

Atomiklan's own answer works, but:

  • is limited to a single group of input lines, all of which are output as a single output line.

  • uses several GNU-specific options, which won't generally work on non-Linux platforms.

  • uses 4 external processes, when 1 will do.
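To illustrate the portability point: a negative line count such as head -n -1 is a GNU extension, whereas deleting the first three lines and the last line can be done with POSIX sed alone. This is only a sketch; the function name strip_frame and the tiny sample table are made up for illustration.

```shell
# Hypothetical helper: drop the 3 header lines and the final border line,
# using only POSIX sed addresses (no GNU-only options).
strip_frame() {
  sed -e '1,3d' -e '$d'
}

# Made-up miniature table to show the effect.
printf '%s\n' '+---+' '| H |' '+---+' '| a |' '| b |' '+---+' | strip_frame
```

With the sample above, only the two data rows (| a | and | b |) remain.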

A generalized solution that outputs each block of lines sharing the same (conceptually) first column value as a single line, using only a single, POSIX-compliant awk command (still assumes a 2-column layout):

... | awk '
  NR <= 3 || /^\+/ { next }                          # skip header and footer
  prev != "" && prev != $2 { printf "\n"; fsep="" }  # see if new block is starting
  { printf "%s", fsep $2 "~" $4; fsep=","; prev=$2 } # print line at hand
  END { printf "\n" }                                # print final newline
'
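To see the block-splitting behavior, here is a small sketch that feeds the same awk program a made-up table whose first column holds two different values. The names table_to_lines and sample_table, and the sample data, are illustrative assumptions rather than part of the answer.

```shell
# Hypothetical wrapper around the 2-column awk program; reads a table on stdin.
table_to_lines() {
  awk '
    NR <= 3 || /^\+/ { next }                          # skip header and footer
    prev != "" && prev != $2 { printf "\n"; fsep="" }  # new block: end the current line
    { printf "%s", fsep $2 "~" $4; fsep=","; prev=$2 } # print row, comma-separated
    END { printf "\n" }                                # final newline
  '
}

# Made-up sample input with two distinct values in the first column.
sample_table() {
  cat <<'EOF'
+-----+-----+
| Key | Val |
+-----+-----+
| aaa | 1   |
| aaa | 2   |
| bbb | 3   |
+-----+-----+
EOF
}

sample_table | table_to_lines
```

This prints two lines, aaa~1,aaa~2 and bbb~3, because the value in the first column changes between the second and third data rows.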

To handle a variable number of columns:

... | awk -F ' *\\| *' '
  NR <= 3 || /^\+/ { next }                          # skip header and footer
  {                                                  # process each data row
    fsep=""; first=1
    for (i=1; i<=NF; ++i) {                          # loop over all fields
      if ($i == "") continue                         # skip empty fields
      # See if a new block is starting and print the appropriate record
      # separator.      
      if (first) {  
        if (prev != "") printf (prev != $i ? "\n" : ",") 
        prev=$i                                      # save 1st nonempty field of the record
        first=0                                      # done with 1st nonempty field
      }
      printf "%s", fsep $i                           # print field at hand.
      fsep="~"                                       # set separator for subsequent fields
    }
  }
  END { printf "\n" }                                # print trailing newline
'
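As a quick check of the variable-column version, the sketch below runs the same logic on a made-up 3-column table. The names rows_to_lines and wide_table and the sample data are assumptions for illustration; the comments inside the single-quoted awk program avoid apostrophes, since an apostrophe there would end the shell quoting early.

```shell
# Hypothetical wrapper around the variable-column awk program; reads stdin.
rows_to_lines() {
  awk -F ' *\\| *' '
    NR <= 3 || /^\+/ { next }                # skip header and footer
    {
      fsep = ""; first = 1
      for (i = 1; i <= NF; ++i) {
        if ($i == "") continue               # skip the empty edge fields
        if (first) {                         # first nonempty field acts as the block key
          if (prev != "") printf "%s", (prev != $i ? "\n" : ",")
          prev = $i
          first = 0
        }
        printf "%s", fsep $i                 # print the field at hand
        fsep = "~"                           # separator for the remaining fields
      }
    }
    END { printf "\n" }                      # trailing newline
  '
}

# Made-up 3-column sample input.
wide_table() {
  cat <<'EOF'
+-----+---+---+
| Key | A | B |
+-----+---+---+
| aaa | 1 | x |
| aaa | 2 | y |
| bbb | 3 | z |
+-----+---+---+
EOF
}

wide_table | rows_to_lines
```

The output is aaa~1~x,aaa~2~y on the first line and bbb~3~z on the second.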

This will work in all awks for any number of input columns:

$ awk -F ' *[|] *' -v OFS='~' 'NF>1 && ++c>1 {$1=$1; gsub(/^~|~$/,""); printf "%s%s", (c>2?",":""), $0} END{print ""}' file
111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789
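The one-liner above is dense, so here is the same program spread over several lines with comments, as a reading aid only. The function names flatten_table and demo_table are made up; the sample input is the table from the question.

```shell
# Hypothetical wrapper: same awk program as the one-liner, reformatted.
flatten_table() {
  awk -F ' *[|] *' -v OFS='~' '
    NF > 1 && ++c > 1 {          # lines containing pipes; c == 1 is the header row
      $1 = $1                    # force $0 to be rebuilt with OFS between fields
      gsub(/^~|~$/, "")          # drop separators left by the empty edge fields
      printf "%s%s", (c > 2 ? "," : ""), $0   # comma before every row but the first
    }
    END { print "" }             # terminate the single output line
  '
}

# The table from the question.
demo_table() {
  cat <<'EOF'
+--------------------------+-----------+
| Text                     | Test      |
+--------------------------+-----------+
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
+--------------------------+-----------+
EOF
}

demo_table | flatten_table
```

Border lines contain no pipe, so they split into a single field and fail the NF > 1 test; that is why no explicit header or footer skipping is needed.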

Command_Output | tail -n +4 | head -n -1 | awk -vORS=, '{ print $2 "~" $4 }' | sed 's/,$/\n/'

Thanks for your help.

A simpler awk-based solution:

Command | awk -vORS=, '($1=="|" && NR>3 ) {print $2"~"$4}'

This, however, leaves a trailing comma at the end of the output. To fix that:

Command | awk -vORS= '($1=="|" && NR>3 ) {if (NR>4) {print ","}; print $2"~"$4}'

which gives:

111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789,111-222-333-444-55555555~123456789
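Because ORS is empty in that command, the output ends without a trailing newline. A possible variant, assuming the same table layout, uses printf with a row counter instead of the NR>4 test and adds the newline in an END block. The names join_rows and demo_table are hypothetical.

```shell
# Hypothetical variant: comma-join the data rows and finish with a newline.
join_rows() {
  awk '
    $1 == "|" && NR > 3 {
      printf "%s%s~%s", (n++ ? "," : ""), $2, $4   # comma before every row but the first
    }
    END { printf "\n" }                            # terminate the output line
  '
}

# The table from the question.
demo_table() {
  cat <<'EOF'
+--------------------------+-----------+
| Text                     | Test      |
+--------------------------+-----------+
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
| 111-222-333-444-55555555 | 123456789 |
+--------------------------+-----------+
EOF
}

demo_table | join_rows
```

Counting printed rows rather than comparing NR keeps the comma logic correct even if the first data row is not on line 4.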


 