简体   繁体   中英

Wrap a single oversize column with awk / bash (pretty print)

Im having this table structure (assume that the delimiters are tabs):

AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery long description which will easily extend the recommended output width of 80 characters.
 03  Etim  Last description

What i want is this:

AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery
           long description which
           will easily extend the
           recommended output width
           of 80 characters.
 03  Etim  Last description

That means I want to split $3 into an array of strings with predefined WIDTH , where the first element is appended "normally" to the current line and all subsequent elements get a new line width identation according to the padding of the first two columns (padding could also be fixed if thats easier).

Alternatively, the text in $0 could be split by a GLOBAL_WIDTH (eg 80 chars) into first string and "rest" -> first string gets printed "normally" with printf, the rest is split by GLOBAL_WIDTH - (COLPAD1 + COLPAD2) and appended width new lines as above.

I tried to work with fmt and fold after my awk formatting (which is basically just putting headings to the table) but they do not reflect awk's field perceptance of course.

How can I achieve this using bash tools and / or awk?

First build a test file (called file.txt ):

echo "AA  BBBB  CCC
01  Item  Description here
02  Meti  A very very veeeery long description which will easily extend the recommended output width of 80 characters.
03  Etim  Last description" > file.txt

Now the script (called ./split-columns.sh ):

#!/bin/bash
FILE=$1

#find position of 3rd column (starting with 'CCC')
padding=`cat $FILE | head -n1 |  grep -aob 'CCC' | grep -oE '[0-9]+'`
paddingstr=`printf "%-${padding}s" ' '`

#set max length
maxcolsize=50
maxlen=$(($padding + $maxcolsize))

cat $FILE | while read line; do 
  #split the line only if it exceeds the desired length
  if [[ ${#line} -gt $maxlen ]] ; then 
    echo "$line" | fmt -s -w$maxcolsize - | head -n1
    echo "$line" | fmt -s -w$maxcolsize - | tail -n+2 | sed "s/^/$paddingstr/"
  else
    echo "$line";
  fi; 
done;

Finally run it with the file as a single argument

./split-columns.sh file.txt > fixed-width-file.txt

Output will be:

AA  BBBB  CCC
01  Item  Description here
02  Meti  A very very veeeery long description
          which will easily extend the recommended output
          width of 80 characters.
03  Etim  Last description

You can try Perl one-liner

perl -lpe ' s/(.{20,}?)\s/$1\n\t   /g ' file

with the given inputs

$ cat thurse.txt
AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery long description which will easily extend the recommended output width of 80 characters.
 03  Etim  Last description

$ perl -lpe ' s/(.{20,}?)\s/$1\n\t   /g ' thurse.txt
AAA  BBBB  CCC
 01  Item  Description
           here
 02  Meti  A very very
           veeeery long description
           which will easily extend
           the recommended output
           width of 80 characters.
 03  Etim  Last description

$

If you want to try with length window of 30/40/50

$ perl -lpe ' s/(.{30,}?)\s/$1\n\t   /g ' thurse.txt
AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery
           long description which will easily
           extend the recommended output width
           of 80 characters.
 03  Etim  Last description

$ perl -lpe ' s/(.{40,}?)\s/$1\n\t   /g ' thurse.txt
AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery long description
           which will easily extend the recommended
           output width of 80 characters.
 03  Etim  Last description

$ perl -lpe ' s/(.{50,}?)\s/$1\n\t   /g ' thurse.txt
AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery long description which
           will easily extend the recommended output width of
           80 characters.
 03  Etim  Last description

$
#!/usr/bin/awk -f
# Read standard input, which should be a file of lines each line
#  containing tab-separated strings.  The string values may be very long.
# Columnate the output by
# wrapping long strings onto multiple lines within each field's
#  specified length.
# Arguments are numeric field lengths.  If an input line contains more
# values than the # of field lengths supplied, the last field length will 
# be re-used.
#
# arguments are the field lengths
# invoke like this: wrapcolumns 30 40 40

BEGIN {
FS="    ";
for (i = 1; i < ARGC; i++) {
    fieldlengths[i-1] = ARGV[i];
    ARGV[i]="";
    }
if (ARGC < 2) {
    print "usage: wrapcolumns length1 ... lengthn";
    exit;
}
}


function blanks(n) {
    result = "                                  ";
    while (length(result) < n) {
        result = result result;
    }
    return substr(result, 1, n);
}

{

# ARGC - 1 is the length of the fieldlengths array
# So ARGC - 2 is the index of its last element because its zero-origin.
# if the input line has more fields than the fieldlengths array,
# use the last element.

# any nonempty fields left?
gotanyleft = 1;

while (gotanyleft == 1) {
    gotanyleft = 0;
    for (i = 1; i <= NF; i++) {
        # length of the current field
        len = (ARGC - 2 < i) ? (fieldlengths[ARGC - 2]) : fieldlengths[i - 1];
        # print that much of the current field and remove that much from the front
        printf "%s", substr($(i) blanks(len), 1, len) ":::"
    $(i) = substr($(i), len + 1);
    if ($(i) != "") {
        gotanyleft = 1;
    }
    }
    print ""
}

}

loop-free awk -solution :

{m,g}awk -v ______="${WIDTH}" 'BEGIN {
 1      OFS = ""
 1       FS = "\t"
 1      ___ = "\32\23"
 1       __ = sprintf("\n%*s", 
                     (_+=_^=_<_)+_^!_+(_+=_____=_+=_+_)+_____,__)
 1     ____ = sprintf("%*s",______-length(__),"")
 1                                   gsub(".",".",____)
       sub("[.].......$","..?.?.?.?.?.?.?.[ ]",____)
 1   ______ = _

 } $!NF = sprintf("%.*s %*s %-*s %-s", _<_,_= $NF,_____,
                  $2,______, $--NF, substr("",gsub(____,
                 ("&")___,_) * gsub("("(___)")+$","",_),
                          __ * gsub( (___), (__),_) )_)'

|

AAA BBBB         CCC
 01 Item         Description here
 02 Meti         A very very veeeery long description which 
                 will easily extend the recommended output 
                 width of 80 characters.
 03 Etim         Last description

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM