
How to use cut and awk commands to extract text input in a tabular format?

I have a file input.txt as below:

filename: test1.v

BUG: bug 102 is fixed by some user
IO_CHANGE: there is no io_change for this version
FEATURE: no feature added

filename: test2.v

BUG: bug 103 is fixed by some user 
also bug 105 is fixed
IO_CHANGE: there is no io_change for this version
FEATURE: yes feature number 3 also feature 23
and feature 34 is added

filename: test3.v

BUG: bug 104 is fixed by some user
FEATURE: yes feature number 2
IO_CHANGE: 

My question: sometimes the description for BUG/FEATURE/IO_CHANGE is long and runs onto a second line, and sometimes IO_CHANGE has no text at all, so it is blank. The output file should list all bugs, then features, then io_changes. The three types can appear in any order in the input file; I need to find all bugs/features/io_changes in the file and list them column-wise.


How about this? We store the values for each file in an array. Here I concatenate entries that span multiple lines.

awk 'function dump() {if (vc>0)      # print one tab-separated row per file
        print fn, vals["BUG"], vals["FEATURE"], vals["IO_CHANGE"]
    } 
    BEGIN {FS=":";OFS="\t";vc=0} 
    FNR==1 {dump();val=""; delete vals; fn=FILENAME; vc=0}   # new file: flush, then reset
    NF>1 {val=$1; vals[val]=vals[val] $2; vc++}              # "KEY: text" line
    NF==1 {vals[val] = vals[val] " " $1}                     # continuation line
    END{dump()}' test*v
  1. The dump() function writes one record to the output.
  2. The BEGIN block sets the field separator to ":" (so this solution does not allow ":" inside the field text). The output is tab-delimited.
  3. Then, at the start of each file (FNR==1), we dump the previous record if there is one and reset our collections.
  4. Then, if a line has a ":" (which results in NF>1), we record which key we are setting and store its value in the array. If there is no ":" (making NF==1), we append the line to the value of the last key we saw.
  5. Finally, at the end of the last file, we dump the contents one last time.
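
The question shows all records in a single file, input.txt, with a "filename:" line starting each record, whereas the command above reads separate test*v files. Below is a minimal sketch of the same idea adapted to that single-file layout. It assumes every record begins with a "filename:" line and that the three keywords always start a line. The function name extract_table, the trim() helper, and the split on the first ":" via index()/substr() (so that later colons in a description survive) are illustrative choices, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: print file, BUG, FEATURE, IO_CHANGE as tab-separated columns.
extract_table() {
    awk '
    function dump() {
        if (fn != "")
            print fn, vals["BUG"], vals["FEATURE"], vals["IO_CHANGE"]
    }
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    BEGIN { OFS = "\t" }
    /^[ \t]*$/ { next }                       # skip blank lines
    /^filename:/ {                            # a new record starts here
        dump()
        split("", vals)                       # portable way to clear an array
        fn = trim(substr($0, index($0, ":") + 1))
        key = ""
        next
    }
    /^(BUG|FEATURE|IO_CHANGE):/ {             # "KEY: text" line
        p = index($0, ":")
        key = substr($0, 1, p - 1)
        vals[key] = trim(vals[key] " " trim(substr($0, p + 1)))
        next
    }
    key != "" {                               # continuation of the previous key
        vals[key] = trim(vals[key] " " trim($0))
    }
    END { dump() }
    ' "$@"
}
```

Calling extract_table input.txt should print one row per "filename:" record, with an empty column wherever a keyword has no text.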

This sets a flag when one of the keywords is found and clears it when either of the other keywords is found. While a flag is set, each line is appended to an array entry keyed by filename.
In the END block, everything before the ":" is removed from each entry.
Then the entries are printed in columns.

#!/bin/bash

awk     'BEGIN{printf("%-8s%-60s%-60s%-20s\n\n","FILE","|BUG","|IO","|FEATURE")}
    /BUG/{a=1}/IO_CHANGE:/ || /FEATURE/{a=0} {if (a){Bug[FILENAME]=Bug[FILENAME]""$0" "}}
    /IO_CHANGE:/{b=1}/BUG/ || /FEATURE/{b=0} {if (b){IO[FILENAME]=IO[FILENAME]$0" "}}
    /FEATURE/{c=1}/IO_CHANGE:/ || /BUG/{c=0} {if (c){Feat[FILENAME]=Feat[FILENAME]$0" "}}
     END{
             for (k in Bug){
                    Bug[k] = substr(Bug[k],index(Bug[k],":"))
                    IO[k] = substr(IO[k],index(IO[k],":"))
                    Feat[k] = substr(Feat[k],index(Feat[k],":"))
                    printf("%-8s%-60s%-60s%-20s\n\n","|"k,"|"Bug[k],"|"IO[k],"|"Feat[k])}}
'  test*v

Unfortunately, this won't print multiple lines for each file; a description that spans two input lines is joined into a single line.
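
One detail of the END block above: substr(s, index(s, ":")) starts at the colon, so the colon itself stays in each cell. The sketch below shows the difference between that call and starting one character later. It assumes the goal is the text after the colon only; strip_key is a hypothetical name used for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show both ways of cutting a "KEY: text" string in awk.
strip_key() {
    awk -v s="$1" 'BEGIN {
        p = index(s, ":")
        print substr(s, p)        # from the colon onward, colon included
        rest = substr(s, p + 1)   # text after the colon only
        sub(/^[ \t]+/, "", rest)  # drop the leading blank
        print rest
    }'
}

strip_key "BUG: bug 102 is fixed by some user"
```

The first printed line begins with ": bug 102", the second with "bug 102".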
