
Writing a bash script without using awk?

We are asked to write a shell script that reads player records from standard input.

Sample input:

id|name|time
23|Jordan|45:17
14|Jason|4:50
12|Bryan|
24|Cody|00:12
33|kobe|41
55|rocky|0

We need to read each record (skipping the header) and output it with the time converted to seconds and the delimiter changed from '|' to ' ' (a space).

As the sample shows, some records have an empty time field (the 3rd record); for those, the time in seconds is taken to be 0. Judging by the expected output, a time with no colon (the 5th record) is read as whole minutes.

Sample Case Output:

23 Jordan 2717
14 Jason 290
12 Bryan 0
24 Cody 12
33 kobe 2460
55 rocky 0

my_solution_script.sh

#!/bin/bash


read -r header
while IFS="|" read -r pid pname time || [[ -n $pid ]]
do  
    min=$(cut -d ':' -f 1 <<< "$time")
    sec=$(cut -d ':' -f 2 <<< "$time")
    ((min*=60))
    ((min+=sec))
    
    echo "$pid $pname $min"
done

Wrong Output:

23 Jordan 2717
14 Jason 290
12 Bryan 0
24 Cody 12
33 kobe 2501
55 rocky 0

As we can see, the script gives the wrong output for the 5th record.

How can I fix the script to get the correct output in every case?

I think a simpler solution might be possible using awk, but I don't know much awk scripting yet, so I am looking for a way to solve this with basic shell commands. Nevertheless, awk solutions are also welcome.

Thank You.

The problem is that cut -d: -f2 <<< "$time" returns all of $time when it doesn't contain a : delimiter. So for kobe you're calculating 41*60+41 instead of just 41*60 .

So you need to check whether $time contains a : before trying to get the seconds.

read -r header
while IFS="|" read -r pid pname time || [[ -n $pid ]]
do  
    min=$(cut -d ':' -f 1 <<< "$time")
    if [[ $time =~ : ]]; then
        sec=$(cut -d ':' -f 2 <<< "$time")
    else
        sec=0
    fi
    ((min*=60))
    ((min+=sec))
    
    echo "$pid $pname $min"
done
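
The cut behaviour described above is easy to confirm in isolation. This is a minimal sketch using two of the sample time values; the variable names are only for illustration. It also shows the standard -s option, which makes cut print nothing for lines that lack the delimiter and so could serve as an alternative to the explicit colon check.

```bash
#!/bin/bash
# With a colon present, field 2 is the seconds part.
with_colon=$(printf '%s\n' "45:17" | cut -d ':' -f 2)
# Without a colon, cut passes the whole line through unchanged.
without_colon=$(printf '%s\n' "41" | cut -d ':' -f 2)
# -s tells cut to print nothing for lines that lack the delimiter.
suppressed=$(printf '%s\n' "41" | cut -s -d ':' -f 2)

echo "with colon:    '$with_colon'"      # '17'
echo "without colon: '$without_colon'"   # '41'
echo "with -s:       '$suppressed'"      # ''
```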

With GNU awk:

awk 'NR>1{$3=$3*60+$4; NF=3; print}' FS='[|:]' file

Output:

23 Jordan 2717
14 Jason 290
12 Bryan 0
24 Cody 12
33 kobe 2460
55 rocky 0

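Assigning NF=3 truncates the record to three fields, so print outputs only the first three columns, joined by the default output separator (a space).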
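
To see why the one-liner also works for records without a colon, it may help to look at how the field separator splits two of the sample lines. This is a small sketch, not part of the answer itself; the variable names are illustrative. A field that does not exist, such as $4 in the second case, evaluates to 0 in arithmetic.

```bash
#!/bin/bash
# FS='[|:]' splits on either character, so "45:17" becomes fields 3 and 4.
fields=$(echo '23|Jordan|45:17' | awk -F '[|:]' '{ print NF }')
# A record without a colon has only three fields; the missing $4 counts as 0.
secs=$(echo '33|kobe|41' | awk -F '[|:]' '{ print $3*60+$4 }')

echo "$fields"   # 4
echo "$secs"     # 2460
```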


See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

Could you please try the following. Written and tested at https://ideone.com/9RkGvJ

awk '
BEGIN{
  FS="|"
}
FNR==1{  next  }
{
  split($3,arr,":")
  $3=(arr[1]*60)+arr[2]
}
1;
' Input_file

Explanation: The BEGIN block sets the field separator to | for all lines. FNR==1{ next } skips the header line. For every other line, split breaks the 3rd field on : into the array arr, and the 3rd field is rebuilt as arr[1]*60 + arr[2], which is the time in seconds. Assigning to $3 also rebuilds the record with the default output separator, a space. The final 1 is an always-true pattern, so each line is printed.

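An immediate fix is to set seconds to zero if there is no colon in the value. You can also avoid the ugly and moderately expensive external processes entirely.

min=${time%%:*}                        # text before the colon, or all of $time if there is none
sec=0
[[ $time == *:* ]] && sec=${time#*:}   # text after the colon, only when one is present
min=${min:-0}                          # an empty time field counts as zero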

This uses the shell's built-in parameter expansion facility to pick apart the value. Briefly, ${time#pattern} returns the value of $time without any prefix which matches pattern ; the % operator does the same for suffixes.
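
As a rough illustration of the two operators, here is what they produce for one sample value with a colon and one without. The variable names after, before, plain and unchanged are made up for this sketch.

```bash
#!/bin/bash
time="45:17"
after=${time#*:}        # strip the shortest prefix matching "*:"  -> 17
before=${time%:*}       # strip the shortest suffix matching ":*"  -> 45

plain="41"
unchanged=${plain#*:}   # no colon, so the pattern does not match  -> 41

echo "$after $before $unchanged"   # 17 45 41
```

When the pattern does not match, the value comes back unchanged, which is why a value without a colon needs separate handling.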

Using an Awk script is almost certainly a better idea; you should be able to learn the basics in less than an hour, perhaps already enough to solve this problem yourself. Here's a quick and dirty untested attempt.

awk -F '|' 'NR>1 { m = s = $3;
    sub(/:.*/, "", m); if (!sub(/.*:/, "", s)) s = 0;
    $3 = m*60+s; print }'

bash:

{
  read -r header
  while IFS='|' read -r id name time; do
    IFS=':' read -r mins secs <<<"$time"
    # default empty fields to 0 so the arithmetic never sees a bare 10#
    echo "$id $name $((60 * 10#${mins:-0} + 10#${secs:-0}))"
  done
} < file

We're using the pattern IFS=delim read -r field1 field2... twice here to do the parsing.

The 10# in the arithmetic expression is to force the values to be interpreted as base-10 numbers. Otherwise, 08 and 09 will be interpreted as invalid octal numbers due to the leading zero.
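
A short sketch of the difference, with illustrative variable names. A leading zero makes bash read the literal as octal, and the 10# prefix overrides that.

```bash
#!/bin/bash
octal=$(( 010 ))         # leading zero selects octal, so this is 8
decimal=$(( 10#010 ))    # forced to base 10, so this is 10

secs=09                  # not a valid octal number on its own
nine=$(( 10#$secs ))     # fine once the base is forced

echo "$octal $decimal $nine"   # 8 10 9
```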

Assuming the other fields won't have an embedded colon, you can parse min and secs in the same read, then use parameter expansion to default an empty min or secs to zero. You can also do all the math in one pass, inside the echo .

read -r header
while IFS="|:" read -r pid pname min secs || [[ -n $pid ]]
do echo "$pid $pname $(( 10#${secs:-0} + 10#${min:-0}*60 ))"
done

If names can have colons, this doesn't work.

As pointed out, leading zeroes would also cause issues, so I added a base-selection prefix ( 10# ) to ensure base-10 math. c.f. https://mywiki.wooledge.org/ArithmeticExpression
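
For a quick check against the sample input from the question, the loop can be wrapped in a function and fed a here-document. This is only a sketch: convert_records is a hypothetical name chosen for the example, and the same assumption applies that names contain no colons.

```bash
#!/bin/bash
# convert_records is a hypothetical wrapper name, not part of the answer above.
convert_records() {
    local header pid pname min secs
    read -r header   # discard the header line
    # Split on both | and : so minutes and seconds land in separate variables.
    while IFS="|:" read -r pid pname min secs || [[ -n $pid ]]; do
        echo "$pid $pname $(( 10#${secs:-0} + 10#${min:-0}*60 ))"
    done
}

output=$(convert_records <<'EOF'
id|name|time
23|Jordan|45:17
14|Jason|4:50
12|Bryan|
24|Cody|00:12
33|kobe|41
55|rocky|0
EOF
)
echo "$output"
```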
