
AWK remove blank lines and append empty columns to all csv files in the directory

Hi, I am looking for a way to combine the commands below into one.

  1. Remove blank lines in the csv file (comma delimited)
  2. Add empty columns to each line, up to the 100th column
  3. Perform actions 1 and 2 on all the files in the folder

I am still learning and this is the best I could get:

awk '!/^[[:space:]]*$/' x.csv > tmp && mv tmp x.csv
awk -F"," '($100="")1' OFS="," x.csv > tmp && mv tmp x.csv

They work individually, but I don't know how to put them together, and I am looking for a way to run them on all the files in the directory.

Looking for concrete AWK code or shell script calling AWK. Thank you!

An example input would be:

a,b,c

x,y,z

Expected output would be:

a,b,c,,,,,,,,,,
x,y,z,,,,,,,,,,

You can combine both steps in a single awk script, without any shell loop:

$ awk 'BEGIN{FS=OFS=","} FNR==1{close(f); f=FILENAME".updated"} NF{$100=""; print > f}' files...

It won't overwrite the original files: the output for each input file goes to a new file with the same name plus an .updated suffix.
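
One detail worth checking with the one-liner above: with FS set to a comma, a line holding only spaces or tabs still counts as one field, so the NF test alone lets it through, and $100="" blanks field 100 on rows that already have 100 or more columns. A minimal sketch of a variant that tests for a non-whitespace character instead and pads only short rows is below. The scratch directory, the sample files, and the variable names n and out are assumptions made for illustration, not part of the original answer.

```shell
# Hypothetical scratch directory with two sample files, for illustration only.
dir=$(mktemp -d)
printf 'a,b,c\n\n \t\nx,y,z\n' > "$dir/one.csv"
printf '1,2\n\n3,4\n' > "$dir/two.csv"

# One awk process handles every file; results go to <name>.updated.
awk -v n=100 '
BEGIN { FS = OFS = "," }
FNR == 1 {                       # first line of each input file
    if (out != "") close(out)    # close the previous output file, if any
    out = FILENAME ".updated"
}
/[^[:space:]]/ {                 # skip empty and whitespace-only lines
    if (NF < n) $n = ""          # pad short rows; leave longer rows untouched
    print > out
}' "$dir"/*.csv
```

After this runs, one.csv.updated should hold two 100-column rows and the originals stay unchanged.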

You can pipe the output of the first to the other:

awk '!/^[[:space:]]*$/' x.csv | awk -F"," '($100="")1' OFS="," > new_x.csv

If you wanted to run the above on all the files in your directory, you would do:

shopt -s nullglob
for f in yourdirectory/*.csv; do
  awk '!/^[[:space:]]*$/' "${f}" | awk -F"," '($100="")1' OFS="," > "${f%/*}/new_${f##*/}"
done

The shopt -s nullglob makes the pattern expand to nothing when no file matches, so the loop body never runs with the literal, unexpanded *.csv as a file name. This is the usual advice from guides on looping through files.
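
If the goal is to replace each file, as in the question's "> tmp && mv tmp x.csv" commands, the loop can write to a temporary file and rename it afterwards. The following is only a sketch: pad_csv is a hypothetical helper name, the scratch directory and sample file are made up, and the [ -e "$f" ] guard is used here as a portable stand-in for nullglob.

```shell
# Hypothetical helper: clean one CSV file in place via a temporary file.
pad_csv() {
    file=$1
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if awk 'BEGIN { FS = OFS = "," }
            /[^[:space:]]/ { if (NF < 100) $100 = ""; print }' "$file" > "$tmp"
    then
        mv "$tmp" "$file"        # replace the original only if awk succeeded
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demo on a scratch directory; substitute your own directory.
dir=$(mktemp -d)
printf 'a,b,c\n\nx,y,z\n' > "$dir/x.csv"

for f in "$dir"/*.csv; do
    [ -e "$f" ] || continue      # skip the unexpanded pattern when nothing matches
    pad_csv "$f"
done
```

Note that the replaced file takes on the permissions of the temporary file created by mktemp, so restore them afterwards if that matters.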

With GNU awk 4.1.0 or later, which provides the inplace extension, you could:

$ gawk -i inplace 'BEGIN{FS=OFS=","}/\S/{NF=100;$1=$1;print}' *

Explained:

$ gawk -i inplace '   # using GNU awk and in-place file editing
BEGIN {
    FS=OFS=","        # set delimiters to a comma
}
/\S/ {                # gawk-specific regex operator that matches any non-whitespace character, so blank lines are skipped
    NF=100            # set the field count to 100: pads shorter records and truncates fields above it
    $1=$1             # reassign the first field to rebuild the record to actually get the extra commas
    print             # output records
}' *
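
To see how assigning NF differs from the $100="" used in the question, here is a small sketch on three columns instead of 100. The helper names set_nf and set_field are hypothetical. The behaviour shown for NF is what gawk documents; its manual warns that some other awk versions do not rebuild the record when NF is decreased.

```shell
# Hypothetical helpers: force the field count, or assign an empty field n.
set_nf()    { awk -v n="$1" 'BEGIN { FS = OFS = "," } { NF = n; print }'; }
set_field() { awk -v n="$1" 'BEGIN { FS = OFS = "," } { $n = ""; print }'; }

# NF = 3 pads the short record and truncates the long one.
printf 'a,b\n1,2,3,4,5\n' | set_nf 3

# $3 = "" pads the short record, but blanks field 3 of the long one
# and keeps the fields after it.
printf 'a,b\n1,2,3,4,5\n' | set_field 3
```

The first command prints a,b, and 1,2,3 while the second prints a,b, and 1,2,,4,5. So the NF form suits "exactly 100 columns", and the field assignment suits "at least 100 columns" only when no row already reaches column 100.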

Some test data (the first blank record is completely empty; the second contains a space and a tab):

$ cat file
1,2,3

  
1,2,3,4,5,6,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101

Output of cat file after the execution of the GNU awk program:

1,2,3,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2,3,4,5,6,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100


 