How do I use the cut and awk commands to extract text input into a tabular format?

Question

I have a file input.txt as below:

filename: test1.v

BUG: bug 102 is fixed by some user
IO_CHANGE: there is no io_change for this version
FEATURE: no feature added

filename: test2.v

BUG: bug 103 is fixed by some user 
also bug 105 is fixed
IO_CHANGE: there is no io_change for this version
FEATURE: yes feature number 3 also feature 23
and feature 34 is added

filename: test3.v

BUG: bug 104 is fixed by some user
FEATURE: yes feature number 2
IO_CHANGE:

My question: sometimes the description for BUG/FEATURE/IO_CHANGE is long and spans two lines, and sometimes IO_CHANGE has nothing after it, so it is blank. The output file should list all bugs, then features, then io_changes. These three types can appear in any order in the input file; I need to find all bugs/features/io_changes in the file and list them column-wise.


Answer 1

How about this? We store the values in an array for each file. Entries that span multiple rows are concatenated.

awk 'function dump() {   # print one tab-separated record per input file
        if (vc>0)
            print fn, vals["BUG"], vals["FEATURE"], vals["IO_CHANGE"]
    }
    BEGIN {FS=":"; OFS="\t"; vc=0}
    FNR==1 {dump(); val=""; delete vals; fn=FILENAME; vc=0}   # new file: flush, then reset
    NF>1 {val=$1; vals[val]=vals[val] $2; vc++}              # "KEY: text" line
    NF==1 {vals[val] = vals[val] " " $1}                     # continuation line
    END {dump()}' test*v

The dump() function is what writes a record to the output.
The BEGIN block sets the field separator to ":" (so in this solution a ":" is not allowed inside the text of a field). The output is delimited by tabs.
Then, at the start of each file (FNR==1), we dump the current record if we have one and reset our collections.
Then, if a line contains a ":" (which makes NF>1), we remember which key we are setting and store its text in the array. If there is no ":" (NF==1), we append the line to the value we were last adding to.
Finally, at the end of the last file, we dump the contents one last time.
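The command above reads one file per module (test*v) and relies on FILENAME, whereas the question shows a single input.txt with "filename:" marker lines. The following is a minimal sketch of the same idea adapted to that single-file layout, not the answerer's code: it starts a new record on each "filename:" line and splits on the first ":" only, so later colons in a description survive. The output name table.tsv and the variable names are assumptions made for illustration.

```shell
# Recreate the sample input from the question as a single file.
cat > input.txt <<'EOF'
filename: test1.v

BUG: bug 102 is fixed by some user
IO_CHANGE: there is no io_change for this version
FEATURE: no feature added

filename: test2.v

BUG: bug 103 is fixed by some user
also bug 105 is fixed
IO_CHANGE: there is no io_change for this version
FEATURE: yes feature number 3 also feature 23
and feature 34 is added

filename: test3.v

BUG: bug 104 is fixed by some user
FEATURE: yes feature number 2
IO_CHANGE:
EOF

awk '
function dump() {                 # emit one tab-separated row per "filename:" block
    if (fn != "")
        print fn, vals["BUG"], vals["FEATURE"], vals["IO_CHANGE"]
}
BEGIN { OFS = "\t" }
{ sub(/[ \t]+$/, "") }            # strip trailing whitespace on every line
$0 == "" { next }                 # skip blank lines
/^filename:/ {                    # a new block starts: flush the previous one
    dump()
    fn = $2; key = ""
    split("", vals)               # portable way to clear an array
    next
}
/^(BUG|FEATURE|IO_CHANGE):/ {
    p = index($0, ":")            # split on the first colon only
    key = substr($0, 1, p - 1)
    text = substr($0, p + 1)
    sub(/^[ \t]+/, "", text)
    vals[key] = text
    next
}
key != "" {                       # continuation line: append to the current key
    vals[key] = (vals[key] == "" ? "" : vals[key] " ") $0
}
END { dump() }
' input.txt > table.tsv

# cut splits on tabs by default, so single columns are easy to pull out:
cut -f1,2 table.tsv               # filename and BUG columns only
```

Because the table is tab-delimited, cut -f3 or cut -f4 would select the FEATURE or IO_CHANGE column in the same way. An empty IO_CHANGE simply yields an empty last field.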

Answer 2

Sets a flag when one phrase is found and clears it when one of the other phrases is found, then saves the lines to an array keyed by filename.
Removes everything before the ":" on each line.
Then prints the lines in columns.

#!/bin/bash

awk     'BEGIN{printf("%-8s%-60s%-60s%-20s\n\n","FILE","|BUG","|IO","|FEATURE")}
    /BUG/{a=1}/IO_CHANGE:/ || /FEATURE/{a=0} {if (a){Bug[FILENAME]=Bug[FILENAME]""$0" "}}
    /IO_CHANGE:/{b=1}/BUG/ || /FEATURE/{b=0} {if (b){IO[FILENAME]=IO[FILENAME]$0" "}}
    /FEATURE/{c=1}/IO_CHANGE:/ || /BUG/{c=0} {if (c){Feat[FILENAME]=Feat[FILENAME]$0" "}}
     END{
             for (k in Bug){
                    Bug[k] = substr(Bug[k],index(Bug[k],":"))
                    IO[k] = substr(IO[k],index(IO[k],":"))
                    Feat[k] = substr(Feat[k],index(Feat[k],":"))
                    printf("%-8s%-60s%-60s%-20s\n\n","|"k,"|"Bug[k],"|"IO[k],"|"Feat[k])}}
'  test*v

Unfortunately, this won't print multiple lines for each file.
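One reading of that limitation is that a long cell is not wrapped onto further lines inside its own column. The following is a rough sketch of how such wrapping could be done on tab-separated input, under that assumption; it is not part of the answer above. The column width w=20, the sample row, and the output file wrapped.txt are made up for illustration, and the text is broken at a fixed character count rather than at word boundaries.

```shell
# One sample row: filename, a long BUG text, a short FEATURE text.
printf 'test2.v\tbug 103 is fixed by some user also bug 105 is fixed\tnone\n' |
awk -F'\t' -v w=20 '
{
    more = 1
    while (more) {                      # keep printing rows until every cell is used up
        more = 0
        line = ""
        for (i = 1; i <= NF; i++) {
            # print the next w characters of this cell, padded to width w
            line = line sprintf("|%-" w "s", substr($i, 1, w))
            $i = substr($i, w + 1)      # keep the remainder for the next row
            if ($i != "") more = 1
        }
        print line
    }
}' > wrapped.txt

cat wrapped.txt
```

Each input record then occupies as many output lines as its longest cell needs, with the shorter cells padded with spaces.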

Answer 1
1 2014-06-20 06:54:45

Answer 2
0
