I have a script that pulls comma-delimited information from a file and prepares update statements. The file is set up as shown below:
ID,NAME,DATE,TIME,HOURS,EMPNUMBER
1, Joe, 12/11, 12:45, 5, 333
2, John, 12/12, 16:45, 7, 666
My script takes a file as the parameter and is run from the command line like below:
./runScript.sh file.csv
My script code is below:
for i in ` cat $1 | grep -v "EMPNUMBER" | cut -d',' -f4,5`
do
    time=`echo $i | cut -d',' -f1`
    hours=`echo $i | cut -d',' -f2`
    echo "update jobs.j j set j.time= $time where j.hours=$hours;"
done
I'm curious why my script skips the top line of the file, which is the header, when I run it. This is the desired effect, but for my learning to progress I need to understand why the first line is skipped.
Can someone help me understand?
If you are learning bash, then in addition to explaining that grep -v "EMPNUMBER" is what causes the header to be skipped (the -v option means print only the lines that do not match EMPNUMBER), there are a few other items to point out. First, good bash code uses the tools bash provides for reading input and parsing data rather than relying on spawning subshells to run additional programs (i.e. cat, grep, cut).
Note: there is nothing wrong with using cat, grep, and cut, but recognizing that bash itself provides tools that do exactly what you are using those three programs for will strengthen your programming skills.
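As a quick check of the grep -v explanation, here is a minimal sketch that runs the same filter against the sample rows from the question. The variable names data and filtered are only for illustration and are not part of the original script.

```shell
#!/bin/bash
# Sample rows copied from the question, held in a variable instead of a file.
data='ID,NAME,DATE,TIME,HOURS,EMPNUMBER
1, Joe, 12/11, 12:45, 5, 333
2, John, 12/12, 16:45, 7, 666'

# grep -v prints only the lines that do NOT match the pattern, so the
# header (the only line containing EMPNUMBER) is dropped.
filtered=$(printf '%s\n' "$data" | grep -v 'EMPNUMBER')

# Show what survives the filter: the two data rows.
printf '%s\n' "$filtered"
```

Nothing in the script refers to "the first line" as such. The header disappears only because it happens to contain the text EMPNUMBER.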
First, bash provides the builtin read for reading data from stdin or any other file. To read a file, you generally see while read var1 var2; do ... done <"filename" instead of for i in $(cat file), for many reasons. Chief among them: the for loop splits its input on any whitespace rather than on line boundaries, and each resulting word is subject to filename globbing. Next, rather than calling cut, bash provides parameter expansion/substring extraction to parse a line of text into individual variables. Further, by choosing the variables given to read wisely, you can eliminate the need for substring extraction entirely.
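To make those two options concrete, here is a small sketch using one data row from the question. The variable names (line, first, last, and the field names passed to read) are assumptions chosen for illustration.

```shell
#!/bin/bash
# One data row from the question.
line='1, Joe, 12/11, 12:45, 5, 333'

# Parameter expansion: no external program is started.
first=${line%%,*}     # remove the longest suffix starting at a comma
last=${line##*, }     # remove the longest prefix ending in ", "

# read with a custom IFS splits the whole row in one step. The comma and
# the space both act as delimiters, and IFS is changed only for this read.
IFS=', ' read -r id nm dt tm hrs eno <<<"$line"

printf 'first=%s last=%s\n' "$first" "$last"
printf 'id=%s name=%s time=%s hours=%s\n' "$id" "$nm" "$tm" "$hrs"
```

With read doing the splitting, every field is already in its own variable, so no further expansion or call to cut is needed.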
The following shows the bash alternatives to the cat, grep, cut approach in your example. If you are interested in learning bash, give it a look and let me know if you have any questions. You can use either echo or printf for output. While echo is generally simpler, printf provides a number of advantages, such as format specifiers and more consistent behavior across shells. It is worth learning both.
#!/bin/bash
## set the datafile name (defaults to 'dat/empdata.dat')
dfn="${1:-dat/empdata.dat}"
## validate that file is readable
[ -r "$dfn" ] || {
    printf "\n error: file not readable '%s'. Usage: %s [filename (dat/empdata.dat)]\n\n" "$dfn" "${0##*/}"
    exit 1
}
## simple output header for data
printf "\nEmployee data read from file: '%s'\n\n" "$dfn"
## read each line in file, skipping header (where $id = ID)
#  IFS is set to include ',' in addition to default ' \t\n'
#  the '|| [ -n "$hrs" ]' test handles a final line with no trailing newline
while IFS=$' ,\t\n' read -r id nm dt tm hrs eno || [ -n "$hrs" ]; do
    # if header row - skip
    [ "$id" = "ID" ] && continue
    # print out each of the values for the employee
    printf "ID: %s NAME: %-4s DATE: %s TIME: %s HOURS: %s EMPNUMBER: %s\n" \
        "$id" "$nm" "$dt" "$tm" "$hrs" "$eno"
done <"$dfn"
input file:
$ cat dat/empdata.dat
ID,NAME,DATE,TIME,HOURS,EMPNUMBER
1, Joe, 12/11, 12:45, 5, 333
2, John, 12/12, 16:45, 7, 666
output:
$ bash empdata.sh
Employee data read from file: 'dat/empdata.dat'
ID: 1 NAME: Joe DATE: 12/11 TIME: 12:45 HOURS: 5 EMPNUMBER: 333
ID: 2 NAME: John DATE: 12/12 TIME: 16:45 HOURS: 7 EMPNUMBER: 666
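The same read loop can be adapted to print the update statements from the question. The sketch below is one possible way to do that, not a definitive implementation: make_updates is a hypothetical helper name, and the statement text is copied from the question without quoting or validating the values.

```shell
#!/bin/bash
# make_updates (hypothetical name) reads CSV rows on stdin and prints
# one update statement per data row.
make_updates() {
    local id nm dt tm hrs eno
    # '|| [ -n "$id" ]' still processes a last line that lacks a newline
    while IFS=$' ,\t\n' read -r id nm dt tm hrs eno || [ -n "$id" ]; do
        [ -z "$id" ] && continue        # skip blank lines
        [ "$id" = "ID" ] && continue    # skip the header row
        printf 'update jobs.j j set j.time=%s where j.hours=%s;\n' "$tm" "$hrs"
    done
}

# Example run against the sample data from the question.
make_updates <<'EOF'
ID,NAME,DATE,TIME,HOURS,EMPNUMBER
1, Joe, 12/11, 12:45, 5, 333
2, John, 12/12, 16:45, 7, 666
EOF
```

To run it against a file instead of the embedded sample, redirect the file into the function, for example make_updates <"$1".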
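Using awk, I tried the following. The header fields are stored in an array on the first line, and each later line is printed with those names as labels:
awk -F ',' 'NR==1 {for(i=1;i<=NF;i++) a[i]=$i; next} {for(i=1;i<=NF;i++) printf("%s:%s\t",a[i],$i); printf("\n")}' file.txt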
output:
ID:1 NAME: Joe DATE: 12/11 TIME: 12:45 HOURS: 5 EMPNUMBER: 333
ID:2 NAME: John DATE: 12/12 TIME: 16:45 HOURS: 7 EMPNUMBER: 666