grep multi column, in order or awk better?

Linux Debian Testing, 64-bit.

I wish to grep or awk the following...

ExifListAll = (below)

DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0

I'll start from the time in column 3, 12:54:33, and search for 1 second before and 1 second after, where column 4 = "On" and column 5 = 1, 2, or 3.

I've tried this so far:

echo "$ExifListAll" | grep -E '2014-07-21.*12:45:3[3-4].*On.*[1-3]'

Can I use an awk one-liner more efficiently?

Am I doing this correctly?

echo "$ExifListAll" | awk '$4 == "On" && $5~/1/,$5~/3/'

Thank you.

grep will work fine for your purposes. You are just having a challenge with the syntax. Primarily, it is easier to use the pattern \s* to match zero or more whitespace characters between fields. You are using .*, which is greedy and matches any run of characters, so it accepts anything between the fields rather than just the gap. Also, a character class matches any one of the characters contained within it, i.e. to match 1, 2, or 3, use [123] (the range [1-3] is equivalent). With those changes, the following accomplishes what your intent appears to be:

echo "$ExifListAll" | grep -E "2014-07-21\s*12:54:3[34]\s*On\s*[123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[34]\s*On\s*[123]"
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3

Is this not the output you were expecting? The 12:54:34 line has Off and a 0, which I interpreted from your question as not wanted. If you want both the On and Off states, including the 0 that goes with 12:54:34 Off, then use:

echo "$ExifListAll" | grep -E "2014-07-21\s*12:54:3[34]\s*(On|Off)\s*[0123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[34]\s*(On|Off)\s*[0123]"
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0

Per the comment that lines 1-6 are desired:

cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[234]\s*On\s*[123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[234]\s*On\s*[123]"
DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
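
If your grep does not understand \s (it is a GNU extension), the POSIX class [[:space:]] should do the same job. Below is a minimal sketch, assuming the sample data is held in the ExifListAll variable as in the question and that the lines carry no trailing whitespace; the matches variable is only there for illustration.

```shell
# Sample data from the question (whitespace-separated columns).
ExifListAll='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# [[:space:]]+ requires at least one whitespace character between fields;
# the trailing $ keeps a counter such as 10 from matching [123].
matches=$(printf '%s\n' "$ExifListAll" |
  grep -E '2014-07-21[[:space:]]+12:54:3[234][[:space:]]+On[[:space:]]+[123]$')

printf '%s\n' "$matches"
```

Using + instead of * also rejects lines where two fields have run together, which \s* would let through.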

You cannot use an awk range pattern or a flag to retrieve more than one row that matches the /end/ pattern, because the range closes at the first row that matches it. For a more general solution with awk, you can convert the time to epoch time (mktime() is a GNU awk extension) and then set up the comparison:

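mydatetime="2014-07-21 12:54:33"
awk -v expected_time="$(date -d "$mydatetime" +%s)" '
  # turn "YYYY-mm-dd HH:MM:SS" into "YYYY mm dd HH MM SS" for mktime()
  { t = $2" "$3; gsub(/[:-]/," ",t); t1 = mktime(t) }
  # print rows within 1 second of the target that are On with a count of 1-3
  t1 >= expected_time-1 && t1 <= expected_time+1 && $4 == "On" && $5 ~ /^[123]$/
' file.txt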

Note:

  1. The -v expected_time=$(...) option sets expected_time to the epoch timestamp of the target time.
  2. The entry time of each record ($2" "$3) is converted into the format "YYYY mm dd HH MM SS" and passed to mktime(), which returns the epoch timestamp inside awk.
  3. The times are compared, and $4 must be "On" and $5 must be 1, 2, or 3.
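
To see why the range pattern from the question falls short, here is a small sketch that runs it against the sample data. It assumes the data is stored in ExifListAll as in the question; the ranged variable is just an illustrative name. The range opens on the first On row with a 1 and closes on the first row whose $5 matches 3, so it prints five rows and stops at DSCF3568.JPG, leaving out DSCF3568.RAF.

```shell
# Sample data from the question (whitespace-separated columns).
ExifListAll='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# Range pattern: start = ($4 == "On" && $5 ~ /1/), end = ($5 ~ /3/).
# awk closes the range at the first record that matches the end pattern.
ranged=$(printf '%s\n' "$ExifListAll" | awk '$4 == "On" && $5 ~ /1/, $5 ~ /3/')

printf '%s\n' "$ranged"
```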

If you know the exact expected_time in advance, as you mentioned, then just use your grep line; it is much simpler and faster than the awk one.

grep -E '2014-07-21.*12:54:3[2-4].*On.*[1-3]' file.txt
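
A middle ground, if you would rather compare fields than write a regex but do not have mktime(), is plain string comparison: zero-padded HH:MM:SS values sort the same way as the times they represent. This is a different technique from the epoch conversion above, sketched under the assumption that all rows share the same fixed-width date and time format. The names day, lo, hi and picked are made up for the example.

```shell
# Sample data from the question (whitespace-separated columns).
ExifListAll='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# $3 and the lo/hi bounds are not numeric-looking, so awk compares them as strings.
picked=$(printf '%s\n' "$ExifListAll" |
  awk -v day='2014-07-21' -v lo='12:54:32' -v hi='12:54:34' '
    $2 == day && $3 >= lo && $3 <= hi && $4 == "On" && $5 ~ /^[123]$/')

printf '%s\n' "$picked"
```

This only works within a single day; a window that crosses midnight would need the epoch approach.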

Thank you all for your suggestions.

I have used an alternative, more direct method with 'exiftool', which reads all the metadata from images.

I select any image in a directory, then compute the time 1 second before it and 1 second after it. I'm not sure yet how to substitute the info provided, but I will sort it out with your help.

DateTimeOrigFirst="$(exiftool -T -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"
DateTimeOrig1SecMinus="$(exiftool -T -globalTimeShift "-0:0:0 0:0:1" -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"
DateTimeOrig1SecPlus="$(exiftool -T -globalTimeShift "+0:0:0 0:0:1" -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"

I can then produce images 1-6 from my first example with:

printf %s\\n "$ExifListAll" | tr '\t' ' ' | grep \
-E "$DateTimeOrigFirst|$DateTimeOrig1SecMinus|$DateTimeOrig1SecPlus"
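
For anyone who already has the centre timestamp as a string, the two neighbouring seconds could also be derived with GNU date instead of two more exiftool calls. This is only a sketch: it assumes GNU date (for -d and the @epoch form), uses center as a stand-in for "$DateTimeOrigFirst", and assumes the date and time are separated by a single space, as they are after the tr step above. The awk filter is added because matching on time alone also picks up the Off row at 12:54:34.

```shell
# Sample data from the question (whitespace-separated columns).
ExifListAll='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

center='2014-07-21 12:54:33'                  # stand-in for "$DateTimeOrigFirst"
epoch=$(date -d "$center" +%s)                # parse to epoch seconds (GNU date)
before=$(date -d "@$((epoch - 1))" '+%F %T')  # one second earlier
after=$(date -d "@$((epoch + 1))" '+%F %T')   # one second later

# -F treats the three timestamps as fixed strings; awk keeps only the On rows.
nearby=$(printf '%s\n' "$ExifListAll" |
  grep -F -e "$before" -e "$center" -e "$after" |
  awk '$4 == "On"')

printf '%s\n' "$nearby"
```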

Thanks again.
