
grep on multiple columns in order, or is awk better?

Debian Linux (testing), 64-bit.

I wish to grep or awk the following...

ExifListAll = (below)

DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0

I want to start from the Column 3 time 12:54:33 and match rows from 1 second before to 1 second after it, where Column 4 is "On" and Column 5 is 1, 2, or 3.

I've tried this so far:

echo "$ExifListAll" | grep -E '2014-07-21.*12:45:3[3-4].*On.*[1-3]'

Can I do this more efficiently with an awk one-liner?

Am I doing this correctly?

echo "$ExifListAll" | awk '$4 == "On" && $5~/1/,$5~/3/'

Thank you.

grep will work fine for your purposes; the problem is in the pattern. First, your pattern looks for 12:45, but the timestamps in your data are 12:54, so nothing matches. Second, it is tighter to match the whitespace between fields with \s* (a GNU grep extension) than with .*, because .* matches any run of characters and so lets the pattern span unrelated fields. Finally, a bracket expression matches any one of the characters it contains, so [123] (equivalent to [1-3]) matches 1, 2, or 3. With those changes, the following does what you appear to intend:

echo "$ExifListAll" | grep -E "2014-07-21\s*12:54:3[34]\s*On\s*[123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[34]\s*On\s*[123]"
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3

Is this not the output you were expecting? The 12:54:34 row has Off and 0, which I took from your question to be unwanted. If you want rows regardless of the On/Off state, including the 0 that goes with 12:54:34 Off 0, then use:

echo "$ExifListAll" | grep -E "2014-07-21\s*12:54:3[34]\s*(On|Off)\s*[0123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[34]\s*(On|Off)\s*[0123]"
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0

Per your comment that lines 1-6 are desired:

cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[234]\s*On\s*[123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[234]\s*On\s*[123]"
DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
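Since the question also asked about awk, here is a minimal sketch of the same filter written field by field, which avoids depending on the spacing between columns. The function name filter_by_fields is made up for illustration, and the date and time range are hard-coded from the sample data.

```shell
# Sample data copied from the question (whitespace-separated columns).
ExifListAll='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# filter_by_fields is a hypothetical helper name, not part of any tool.
# Each condition tests exactly one column, so a digit in the file name
# or the time can never satisfy the test meant for column 5.
filter_by_fields() {
  awk '$2 == "2014-07-21" && $3 ~ /^12:54:3[234]$/ && $4 == "On" && $5 ~ /^[123]$/'
}

printf '%s\n' "$ExifListAll" | filter_by_fields
```

For a fixed, known time this is no faster than grep; the gain is only that each test is anchored to its own column.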

You cannot use an awk range pattern or a flag to retrieve more than one row matching the /end/ condition, because the range closes at the first line that matches it. For a more general solution with awk, you can convert the time to epoch time and then set up the comparison:

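mydatetime="2014-07-21 12:54:33"
# mktime() is not POSIX awk (gawk provides it); date -d requires GNU date
awk -v expected_time="$(date -d "$mydatetime" +%s)" '
  # turn "YYYY-mm-dd HH:MM:SS" into "YYYY mm dd HH MM SS" for mktime()
  { t = $2" "$3; gsub(/[:-]/," ",t); t1 = mktime(t) }
  t1 >= expected_time-1 && t1 <= expected_time+1 && $4 == "On" && $5 ~ /^[123]$/
' file.txt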

Notes:

  1. The -v expected_time=$(...) assignment sets expected_time to the epoch timestamp of the reference time.
  2. The entry time ($2" "$3) of each record is converted to the format "YYYY mm dd HH MM SS" and passed to mktime(), which returns the epoch timestamp inside awk.
  3. The final condition compares the times and makes sure $4 is "On" and $5 is 1, 2, or 3.
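If your awk has no mktime(), the same one-second window can be sketched in portable awk by comparing seconds since midnight instead of epoch timestamps. This is only an illustration under two assumptions: all rows of interest fall on the same date, and the window does not cross midnight. The helper name within_one_second and its two arguments are invented here.

```shell
# Sample data copied from the question.
ExifListAll='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# within_one_second is a hypothetical helper.
#   $1 = date (YYYY-mm-dd), $2 = centre time (HH:MM:SS)
# Assumption: the window does not cross midnight.
within_one_second() {
  awk -v day="$1" -v centre="$2" '
    # Convert HH:MM:SS to seconds since midnight.
    function secs(hms,   p) { split(hms, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
    BEGIN { c = secs(centre) }
    $2 == day && $4 == "On" && $5 ~ /^[123]$/ {
      d = secs($3) - c
      if (d >= -1 && d <= 1) print
    }
  '
}

printf '%s\n' "$ExifListAll" | within_one_second 2014-07-21 12:54:33
```

The epoch-based version above remains the more general one, since it handles windows that span a date change.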

If you already know the exact expected_time, as you mentioned, just use your grep line with the time range corrected; it is much simpler and faster than the awk one.

grep -E '2014-07-21.*12:54:3[2-4].*On.*[1-3]' file.txt
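To illustrate the earlier point about range patterns, here is a small sketch that runs the range attempt from the question against the sample data. The wrapper name range_attempt is made up for this example.

```shell
# Sample data copied from the question.
ExifListAll='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# range_attempt is a hypothetical wrapper around the range pattern
# from the question. The range opens on the first row where column 5
# contains 1 and closes on the first row where column 5 contains 3.
range_attempt() {
  awk '$4 == "On" && $5 ~ /1/, $5 ~ /3/'
}

printf '%s\n' "$ExifListAll" | range_attempt
```

The range closes on DSCF3568.JPG, the first row whose column 5 is 3, so DSCF3568.RAF is dropped even though it also matches the end condition.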

Thank you all for your suggestions.

I ended up using a more direct alternative, exiftool, which reads all the metadata from images.

I pick any image in a directory, then compute the times one second before and one second after it. I'm not sure yet how to substitute the information you provided, but I will work it out with your help.

DateTimeOrigFirst="$(exiftool -T -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"
DateTimeOrig1SecMinus="$(exiftool -T -globalTimeShift "-0:0:0 0:0:1" -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"
DateTimeOrig1SecPlus="$(exiftool -T -globalTimeShift "+0:0:0 0:0:1" -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"

I can then produce images 1-6 in my first example with:

printf %s\\n "$ExifListAll" | tr '\t' ' ' | grep \
-E "$DateTimeOrigFirst|$DateTimeOrig1SecMinus|$DateTimeOrig1SecPlus"
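As a possible variation, the two shifted timestamps could be derived from the first one in the shell, so that exiftool only needs to be called once. This is a rough sketch that assumes GNU date (for -d and the @epoch form); the helper name window_pattern is invented for illustration.

```shell
# window_pattern is a hypothetical helper. It assumes GNU date.
#   $1 = centre timestamp in "YYYY-mm-dd HH:MM:SS" form
# Prints an alternation usable with grep -E: before|centre|after.
window_pattern() {
  centre=$1
  epoch=$(date -d "$centre" +%s)                      # local time -> epoch seconds
  before=$(date -d "@$((epoch - 1))" '+%F %T')        # one second earlier
  after=$(date -d "@$((epoch + 1))" '+%F %T')         # one second later
  printf '%s|%s|%s\n' "$before" "$centre" "$after"
}

window_pattern '2014-07-21 12:54:33'
```

It could then be used as grep -E "$(window_pattern "$DateTimeOrigFirst")". Note that such a pattern tests only the timestamp, so any filtering on the On/Off column or the counter still has to be added separately.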

Thanks again.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please credit the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 