
How to store .txt file data in different columns of a CSV file using BASH?

I have a .txt file with the following data structure:

Scan Times:
 33.3 seconds
 77.4 seconds
 33.3 seconds
 77.4 seconds

Check Times:
 110.30 seconds
 72.99 seconds
 72.16 seconds
 110.30 seconds

Move Times:
 73.66 seconds
 90.77 seconds
 72.87 seconds
 71.75 seconds

Switch Times:
 92.0 seconds
 78.6 seconds
 77.8 seconds
 84.9 seconds

I now want to take that .txt file and create a CSV file that has the following format.

(Image of the desired CSV layout; not available.)

I have a very basic layout so far with my bash script but I am not sure how to proceed:

inputFiles=("./Successes/SuccessSummary.txt" "./Failures/FailSummary.txt")
touch results.csv

for file in "${inputFiles[@]}"
do
    while IFS= read -r line
    do
        #echo "$line"
        if [ "$line" = "Scan Times:" ]
        then
            :   # placeholder: an empty then-branch is a syntax error in bash
        fi

        if [ "$line" = "Check Times:" ]
        then
            :
        fi

        if [ "$line" = "Move Times:" ]
        then
            :
        fi

        if [ "$line" = "Switch Times:" ]
        then
            :
        fi
    done < "$file"
done
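For reference, one way the loop above could be completed in bash alone is sketched below. It assumes bash 4 or later (for associative arrays), that every header line ends in a colon, and that every data line ends in "seconds". The function name txt_to_csv is hypothetical and not part of the original script.

```shell
#!/usr/bin/env bash
# Sketch: pivot "Header:" blocks of " <number> seconds" lines into CSV columns.
# txt_to_csv is a hypothetical helper name chosen for this example.
txt_to_csv() {
    local file=$1 line value out r c
    local col=-1 row=0 maxrow=0
    local -a headers=()
    local -A cell=()                # cell[row,col] -> value (bash 4+)

    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *:)                     # a header such as "Scan Times:" starts a new column
                col=$((col + 1))
                row=0
                headers[col]=$line
                ;;
            *seconds)               # a data line such as " 33.3 seconds"
                read -r value _ <<< "$line"   # keep the first word only
                row=$((row + 1))
                cell[$row,$col]=$value
                if [ "$row" -gt "$maxrow" ]; then maxrow=$row; fi
                ;;
        esac
    done < "$file"

    # Header row first, then one CSV row per data row.
    (IFS=,; printf '%s\n' "${headers[*]}")
    for ((r = 1; r <= maxrow; r++)); do
        out=
        for ((c = 0; c <= col; c++)); do
            out+=${cell[$r,$c]-}
            if [ "$c" -lt "$col" ]; then out+=,; fi
        done
        printf '%s\n' "$out"
    done
}
```

It could be called as txt_to_csv ./Successes/SuccessSummary.txt > results.csv. A column with fewer values than the others simply yields empty cells in the later rows.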

Here's an awk script that does it:

#!/usr/bin/awk -f

BEGIN {
    OFS=","
    colnum=0
}

/:$/ {
    data[++colnum,1]=$0
    rownum=1
}

/seconds$/ {
    data[colnum,++rownum]=$1
}

END {
    for (r = 1; r <= rownum; r++) {
        for (c = 1; c <= colnum; c++) {
            # ORS is the output record separator; RS only happens to have the same default
            printf "%s%s", data[c,r], (c == colnum ? ORS : OFS)
        }
    }
}

Example:

$ ./pivot input.txt
Scan Times:,Check Times:,Move Times:,Switch Times:
33.3,110.30,73.66,92.0
77.4,72.99,90.77,78.6
33.3,72.16,72.87,77.8
77.4,110.30,71.75,84.9

Using any awk in any shell on every Unix box:

$ cat tst.awk
BEGIN { RS=""; FS="\n"; OFS="," }
{
    for (i=1; i<=NF; i++) {
        if (i > 1) {
            gsub(/[^0-9.]/,"",$i)
        }
        vals[i,NR] = $i
    }
}
END {
    for (i=1; i<=NF; i++) {
        for (j=1; j<=NR; j++) {
            printf "%s%s", vals[i,j], (j<NR ? OFS : ORS)
        }
    }
}

$ awk -f tst.awk file
Scan Times:,Check Times:,Move Times:,Switch Times:
33.3,110.30,73.66,92.0
77.4,72.99,90.77,78.6
33.3,72.16,72.87,77.8
77.4,110.30,71.75,84.9
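One caveat with paragraph mode (RS=""): awk splits records only at completely empty lines, so a separator line that contains a stray space or tab would merge two blocks into one record. The sketch below assumes such lines may occur and blanks them with sed before running the same logic, inlined so the example is self-contained. The function name pivot_blocks is hypothetical, and the row count is tracked in a variable rather than read from NF inside END.

```shell
# Sketch: normalize whitespace-only lines, then pivot blocks in awk paragraph mode.
# pivot_blocks is a hypothetical helper name chosen for this example.
pivot_blocks() {
    # Turn whitespace-only lines into truly empty ones so RS="" sees them.
    sed 's/^[[:space:]]*$//' "$1" |
    awk '
    BEGIN { RS=""; FS="\n"; OFS="," }
    {
        for (i=1; i<=NF; i++) {
            if (i > 1) {
                gsub(/[^0-9.]/,"",$i)    # keep only digits and dots in data lines
            }
            vals[i,NR] = $i
        }
        if (NF > rows) rows = NF         # remember the longest block
    }
    END {
        for (i=1; i<=rows; i++) {
            for (j=1; j<=NR; j++) {
                printf "%s%s", vals[i,j], (j<NR ? OFS : ORS)
            }
        }
    }'
}
```

It could be called as pivot_blocks file > results.csv.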

If ed is available and acceptable, here is a solution that uses it with some help from standard Unix/Linux utilities.

With one file.

The script myscript:

#!/bin/sh

ed -s "$1" <<-EOF
 g/.\\{1,\\}/s/^ //\\
 s/ seconds//
 w tmpa.$$
 %d
 r !pr -t4 -s, tmpa.$$
 d
 !rm tmpa.$$
 w result.csv
 %p
 Q
EOF

Then

./myscript ./Successes/SuccessSummary.txt

The output, which is also the content of result.csv:

Scan Times:,Check Times:,Move Times:,Switch Times:
33.3,110.30,73.66,92.0
77.4,72.99,90.77,78.6
33.3,72.16,72.87,77.8
77.4,110.30,71.75,84.9

With two files. (The example output below simply reuses the content of the first file for the second.)

#!/bin/sh

ed -s "$1" <<-EOF
 g/.\\{1,\\}/s/^ //\\
 s/ seconds//
 w tmpa.$$
 %d
 r !pr -t4 -s, tmpa.$$
 d
 w tmpa.$$
 E $2
 g/.\\{1,\\}/s/^ //\\
 s/ seconds//
 w tmpb.$$
 %d
 r !pr -t4 -s, tmpb.$$
 d
 w tmpb.$$
 %d
 r !pr -mts, tmpa.$$ tmpb.$$
 %p
 w result.csv
 !rm tmp[ab].$$
 Q
EOF

Then

./myscript ./Successes/SuccessSummary.txt ./Failures/FailSummary.txt

The output, which is also the content of result.csv:

Scan Times:,Check Times:,Move Times:,Switch Times:,Scan Times:,Check Times:,Move Times:,Switch Times:
33.3,110.30,73.66,92.0,33.3,110.30,73.66,92.0
77.4,72.99,90.77,78.6,77.4,72.99,90.77,78.6
33.3,72.16,72.87,77.8,33.3,72.16,72.87,77.8
77.4,110.30,71.75,84.9,77.4,110.30,71.75,84.9

  • The two-file ed script creates two temporary files, tmpa.$$ and tmpb.$$, which are deleted by the !rm line near the end of the script.

  • The output is written to the file result.csv.

  • ed is a file editor, not a scripting or programming language like awk or bash, and not everyone likes it, but it is still an option.
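For comparison, the two-file result could also be produced without temporary files by pivoting each file separately and joining the results with paste. This is only a sketch: it assumes bash (for process substitution) and the same number of rows in both files. The function names pivot_one and pivot_two are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pivot one file into CSV, then join two pivoted files side by side.
# pivot_one and pivot_two are hypothetical helper names chosen for this example.
pivot_one() {
    awk '
    /:$/       { hdr[++c] = $0; r = 0 }                        # header starts a new column
    /seconds$/ { val[c, ++r] = $1; if (r > rows) rows = r }    # numeric part of a data line
    END {
        # Row 0 is the header row; rows 1..rows hold the values.
        for (i = 0; i <= rows; i++)
            for (j = 1; j <= c; j++)
                printf "%s%s", (i ? val[j, i] : hdr[j]), (j < c ? "," : "\n")
    }' "$1"
}

pivot_two() {
    # paste joins the corresponding lines of both streams with a comma.
    paste -d, <(pivot_one "$1") <(pivot_one "$2")
}
```

It could be called as pivot_two ./Successes/SuccessSummary.txt ./Failures/FailSummary.txt > result.csv.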

This might work for you (GNU sed, csplit & paste):

sed '/\S/!d;s/^ \| seconds//g' file |
csplit -zs - '/:/' '{*}' && paste -d, xx* && rm xx*

Use sed to remove blank lines, the leading space, and the literal " seconds".

Use csplit to split the result into separate files (xx00, xx01, and so on), one per block.

Use paste to combine the separate parts back into one, with a comma as the field separator.

Clean up the leftover files.
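Because the pipeline above writes its pieces into the current directory, the xx* glob would also pick up any unrelated files whose names start with xx. The sketch below runs the same steps inside a scratch directory created with mktemp -d. It still assumes GNU sed and GNU csplit, and the function name split_and_paste is hypothetical.

```shell
# Sketch: the same sed | csplit && paste steps, confined to a scratch directory.
# split_and_paste is a hypothetical helper name chosen for this example.
split_and_paste() {
    local workdir status
    workdir=$(mktemp -d) || return 1

    # -f sets the output prefix, so the pieces land in the scratch directory.
    sed '/\S/!d;s/^ \| seconds//g' "$1" |
        csplit -z -s -f "$workdir/xx" - '/:/' '{*}' &&
        paste -d, "$workdir"/xx*
    status=$?

    rm -rf "$workdir"       # clean up whether or not the steps succeeded
    return "$status"
}
```

It could be called as split_and_paste file > result.csv.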

