Wrap a single oversized column with awk / bash (pretty print)

Question

I have this table structure (assume the delimiters are tabs):

AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery long description which will easily extend the recommended output width of 80 characters.
 03  Etim  Last description

What I want is this:

AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery
           long description which
           will easily extend the
           recommended output width
           of 80 characters.
 03  Etim  Last description

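That means I want to split $3 into an array of strings with a predefined WIDTH. The first element is appended "normally" to the current line, and every subsequent element goes on a new line, indented by the padding of the first two columns (the padding can also be fixed, if that is easier).

Alternatively, the text in $0 could be split at GLOBAL_WIDTH (e.g. 80 characters) into a first string and a "rest". The first string is printed "normally" with printf; the rest is split at GLOBAL_WIDTH - (COLPAD1 + COLPAD2) and appended on new lines with the padding described above.

I tried using fmt and fold after my awk formatting (which basically just puts headers on the table), but of course they do not reflect awk's field awareness.

How can I achieve this using bash tools and/or awk?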

Answer 1

First build a test file (called file.txt):

echo "AA  BBBB  CCC
01  Item  Description here
02  Meti  A very very veeeery long description which will easily extend the recommended output width of 80 characters.
03  Etim  Last description" > file.txt

Now the script (called ./split-columns.sh):

#!/bin/bash
FILE=$1

# find the byte offset of the 3rd column (the one whose header starts with 'CCC')
padding=$(head -n1 "$FILE" | grep -aob 'CCC' | grep -oE '[0-9]+')
paddingstr=$(printf "%${padding}s" '')

# set max length
maxcolsize=50
maxlen=$((padding + maxcolsize))

# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line; do
  # split the line only if it exceeds the desired length
  if [[ ${#line} -gt $maxlen ]]; then
    # wrap once, then indent every line after the first
    wrapped=$(printf '%s\n' "$line" | fmt -s -w"$maxcolsize")
    printf '%s\n' "$wrapped" | head -n1
    printf '%s\n' "$wrapped" | tail -n+2 | sed "s/^/$paddingstr/"
  else
    printf '%s\n' "$line"
  fi
done < "$FILE"

Finally, run it with the file as its single argument:

./split-columns.sh file.txt > fixed-width-file.txt

The output will be:

AA  BBBB  CCC
01  Item  Description here
02  Meti  A very very veeeery long description
          which will easily extend the recommended output
          width of 80 characters.
03  Etim  Last description
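
Note that the script wraps the whole line with fmt, so the first two columns count toward the width. For comparison, here is a minimal field-aware sketch in awk that wraps only the description column at word boundaries. It is an illustration under stated assumptions, not part of the answer above: the function name wrap_col3 and the variable width are hypothetical, the first two columns are assumed to contain no whitespace, and their printed widths (3 and 4) are hard-coded from the sample data.

```shell
#!/bin/bash
# wrap_col3 WIDTH: hypothetical helper. Reads rows on stdin, treats the first
# two whitespace-separated fields as columns 1 and 2 and everything after
# them as the description, which is wrapped to at most WIDTH characters.
wrap_col3() {
  awk -v width="$1" '
  !NF { print; next }                               # pass blank lines through
  {
    prefix = sprintf("%3s  %-4s  ", $1, $2)         # column widths assumed from the sample
    pad    = sprintf("%" length(prefix) "s", "")    # blank prefix for continuation lines
    line   = ""
    for (i = 3; i <= NF; i++) {
      if (line == "") {
        line = $i                                   # first word always starts a line
      } else if (length(line) + 1 + length($i) <= width) {
        line = line " " $i                          # word still fits
      } else {
        print prefix line                           # flush and start a continuation line
        prefix = pad
        line = $i
      }
    }
    print prefix line
  }'
}
```

It could be run as `wrap_col3 24 < file.txt`. A word longer than WIDTH is kept whole on its own line rather than cut.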

Answer 2

You can try a Perl one-liner:

perl -lpe ' s/(.{20,}?)\s/$1\n\t   /g ' file

With the given input:

$ cat thurse.txt
AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery long description which will easily extend the recommended output width of 80 characters.
 03  Etim  Last description

$ perl -lpe ' s/(.{20,}?)\s/$1\n\t   /g ' thurse.txt
AAA  BBBB  CCC
 01  Item  Description
           here
 02  Meti  A very very
           veeeery long description
           which will easily extend
           the recommended output
           width of 80 characters.
 03  Etim  Last description

$

If you want to try length windows of 30/40/50:

$ perl -lpe ' s/(.{30,}?)\s/$1\n\t   /g ' thurse.txt
AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery
           long description which will easily
           extend the recommended output width
           of 80 characters.
 03  Etim  Last description

$ perl -lpe ' s/(.{40,}?)\s/$1\n\t   /g ' thurse.txt
AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery long description
           which will easily extend the recommended
           output width of 80 characters.
 03  Etim  Last description

$ perl -lpe ' s/(.{50,}?)\s/$1\n\t   /g ' thurse.txt
AAA  BBBB  CCC
 01  Item  Description here
 02  Meti  A very very veeeery long description which
           will easily extend the recommended output width of
           80 characters.
 03  Etim  Last description

$

Answer 3

#!/usr/bin/awk -f
# Read standard input, which should be a file of lines each line
#  containing tab-separated strings.  The string values may be very long.
# Columnate the output by
# wrapping long strings onto multiple lines within each field's
#  specified length.
# Arguments are numeric field lengths.  If an input line contains more
# values than the # of field lengths supplied, the last field length will 
# be re-used.
#
# arguments are the field lengths
# invoke like this: wrapcolumns 30 40 40

BEGIN {
    # fields are tab-separated, as described in the header comment
    FS = "\t";
    for (i = 1; i < ARGC; i++) {
        fieldlengths[i-1] = ARGV[i];
        ARGV[i] = "";
    }
    if (ARGC < 2) {
        print "usage: wrapcolumns length1 ... lengthn";
        exit;
    }
}


function blanks(n) {
    result = "                                  ";
    while (length(result) < n) {
        result = result result;
    }
    return substr(result, 1, n);
}

{
    # ARGC - 1 is the length of the fieldlengths array,
    # so ARGC - 2 is the index of its last element because it's zero-origin.
    # If the input line has more fields than the fieldlengths array,
    # use the last element.

    # any nonempty fields left?
    gotanyleft = 1;

    while (gotanyleft == 1) {
        gotanyleft = 0;
        for (i = 1; i <= NF; i++) {
            # length of the current field
            len = (ARGC - 2 < i) ? fieldlengths[ARGC - 2] : fieldlengths[i - 1];
            # print that much of the current field and remove that much from the front
            printf "%s", substr($(i) blanks(len), 1, len) ":::"
            $(i) = substr($(i), len + 1);
            if ($(i) != "") {
                gotanyleft = 1;
            }
        }
        print ""
    }
}
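
The script cuts each field at fixed character positions with substr, so a word can be split across two output lines. The sketch below isolates that one step for a single field, to make the behaviour easy to try. The name chunk_field is hypothetical, and each input line is treated as one field.

```shell
#!/bin/bash
# chunk_field LEN: hypothetical helper showing the substr-based step.
# Each input line is printed in consecutive pieces of at most LEN characters.
chunk_field() {
  awk -v len="$1" '
  {
    s = $0
    while (s != "") {
      print substr(s, 1, len)    # emit the next LEN characters
      s = substr(s, len + 1)     # and drop them from the front
    }
  }'
}
```

For instance, `printf 'abcdefgh\n' | chunk_field 3` prints abc, def and gh on separate lines. Wrapping at word boundaries instead would need a word-by-word loop.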

Answer 4

A loop-free awk solution:

{m,g}awk -v ______="${WIDTH}" 'BEGIN {
 1      OFS = ""
 1       FS = "\t"
 1      ___ = "\32\23"
 1       __ = sprintf("\n%*s", 
                     (_+=_^=_<_)+_^!_+(_+=_____=_+=_+_)+_____,__)
 1     ____ = sprintf("%*s",______-length(__),"")
 1                                   gsub(".",".",____)
       sub("[.].......$","..?.?.?.?.?.?.?.[ ]",____)
 1   ______ = _

 } $!NF = sprintf("%.*s %*s %-*s %-s", _<_,_= $NF,_____,
                  $2,______, $--NF, substr("",gsub(____,
                 ("&")___,_) * gsub("("(___)")+$","",_),
                          __ * gsub( (___), (__),_) )_)'

|

AAA BBBB         CCC
 01 Item         Description here
 02 Meti         A very very veeeery long description which 
                 will easily extend the recommended output 
                 width of 80 characters.
 03 Etim         Last description


Answer 1
3 Accepted 2019-03-07 14:13:18

Answer 2
3 2019-03-08 09:36:57

Answer 3
0 2022-06-24 22:37:19

Answer 4
0 2022-06-25 14:35:23
