grep multiple columns in sequence, or is awk better?

Question

Linux Debian Testing 64.

I wish to grep or awk the following...

ExifListAll = (below)

DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0

Starting from the Column 3 time 12:54:33, I want to search for 1 second before and 1 second after, where Column 4 = "On" and Column 5 = 1, 2, or 3.

I've tried this so far:

echo "$ExifListAll" | grep -E '2014-07-21.*12:45:3[3-4].*On.*[1-3]'

Can I use an awk one-liner more efficiently?

Am I doing this correctly?

echo "$ExifListAll" | awk '$4 == "On" && $5~/1/,$5~/3/'

Thank you.

Answer 1

grep will work fine for your purposes; you are just having trouble with the syntax. Primarily, it is easier to use the pattern \s* to match zero or more whitespace characters between fields. You are using .*, which is greedy and matches any run of characters, so it can span far more of the line than the gap between two fields and makes the pattern looser than it needs to be. Also, a character class matches any one of the characters it contains, i.e. to match 1, 2, or 3, use [123]. Note too that your pattern has 12:45 where the data has 12:54. With those changes, the following accomplishes what your intent appears to be:

echo "$ExifListAll" | grep -E "2014-07-21\s*12:54:3[34]\s*On\s*[123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[34]\s*On\s*[123]"
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3

Is this not the output you were expecting? 12:54:34 had Off and a 0, which I interpreted from your question as not wanted. If you want the rows regardless of the On/Off state, including the 0 corresponding to 12:54:34 Off 0, then use:

echo "$ExifListAll" | grep -E "2014-07-21\s*12:54:3[34]\s*(On|Off)\s*[0123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[34]\s*(On|Off)\s*[0123]"
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0

Per the comment that lines 1-6 are desired:

cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[234]\s*On\s*[123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[234]\s*On\s*[123]"
DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
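
To make the difference between .* and a whitespace-only separator concrete, here is a small sketch. The second sample row (DSCF3570.JPG with a trailing note) is hypothetical and is not part of the original data, and [[:space:]] is used as the POSIX spelling of \s; treat this as an illustration rather than a definitive test.

```shell
# Two sample rows; the second one is a hypothetical line added for illustration.
sample='DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3570.JPG    2014-07-21 12:54:33 On  0  retake 1'

# Loose pattern: ".*" lets "On" and the digit sit anywhere later on the line,
# so the "1" in the trailing note is enough to match the second row.
loose="$(printf '%s\n' "$sample" | grep -cE '2014-07-21.*12:54:3[34].*On.*[1-3]')"

# Tight pattern: only whitespace may separate the fields, so "On  0" is rejected.
tight="$(printf '%s\n' "$sample" | grep -cE '2014-07-21[[:space:]]*12:54:3[34][[:space:]]*On[[:space:]]*[123]')"

echo "loose=$loose tight=$tight"   # prints: loose=2 tight=1
```

The loose pattern counts both rows, while the tight one counts only the row whose fifth column is really 3.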

Answer 2

You cannot use an awk range pattern or a flag to retrieve more than one row matching the /end/ pattern, because the range closes at the first row that matches it. For a more general solution with awk, you can convert the time to epoch time and then set up the comparison:

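mydatetime="2014-07-21 12:54:33"
# mktime() is a gawk extension; date -d is GNU date
awk -v expected_time="$(date -d "$mydatetime" +%s)" '
  # turn "YYYY-mm-dd HH:MM:SS" into "YYYY mm dd HH MM SS" for mktime()
  { t = $2" "$3; gsub(/[:-]/," ",t); t1 = mktime(t) }
  # keep rows within 1 second of the target, with $4 == "On" and $5 in 1..3
  t1 >= expected_time-1 && t1 <= expected_time+1 && $4 == "On" && $5 ~ /^[123]$/
' file.txt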

Note:

Set expected_time to the epoch timestamp with -v expected_time=$(...).
Convert the entry time ($2" "$3) of each record into the format "YYYY mm dd HH MM SS", then feed it into mktime() to generate the epoch timestamp within awk.
Compare the times and make sure $4 is "On" and $5 is 1, 2, or 3.

If you know the expected time exactly, as you mentioned, then just use your grep line; it is much simpler and faster than the awk one.

grep -E '2014-07-21.*12:54:3[2-4].*On.*[1-3]' file.txt
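
The range-pattern limitation mentioned at the start of this answer can be seen with the sample data from the question. This is only a sketch: the variable names (data, range_count, filter_count) are made up for the demonstration.

```shell
# Sample data copied from the question.
data='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# Range pattern: opens at the first row with $5 == 1 and closes at the
# FIRST row with $5 == 3, so the second "3" row (DSCF3568.RAF) is dropped.
range_count="$(printf '%s\n' "$data" | awk '$5 == 1, $5 == 3 { n++ } END { print n }')"

# Per-row condition: every row is tested on its own, so both "3" rows are kept.
filter_count="$(printf '%s\n' "$data" | awk '$4 == "On" && $5 ~ /^[123]$/ { n++ } END { print n }')"

echo "range=$range_count filter=$filter_count"   # prints: range=5 filter=6
```

The range version returns five rows and the per-row condition returns six, which is why a plain condition is used in the awk command above.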

Answer 3

Thank you all for your suggestions.

I have used an alternative, more direct method using 'exiftool', which reads all the metadata from images.

I select any image in a directory, then compute the times 1 second before and 1 second after it. I'm not sure yet how to substitute the info provided, but I will sort it out with your help.

DateTimeOrigFirst="$(exiftool -T -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"
DateTimeOrig1SecMinus="$(exiftool -T -globalTimeShift "-0:0:0 0:0:1" -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"
DateTimeOrig1SecPlus="$(exiftool -T -globalTimeShift "+0:0:0 0:0:1" -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"

I can then produce images 1-6 from my first example with:

printf %s\\n "$ExifListAll" | tr '\t' ' ' | grep \
-E "$DateTimeOrigFirst|$DateTimeOrig1SecMinus|$DateTimeOrig1SecPlus"
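
If the base timestamp is already available as text, the same three-way alternation could be built without a second and third exiftool call. The following is a sketch assuming GNU date (as on Debian); the names base, epoch, minus, plus, and pattern are illustrative, and TZ is pinned to UTC only to keep the demonstration reproducible.

```shell
export TZ=UTC   # assumption: fixed zone so the arithmetic is reproducible

base='2014-07-21 12:54:33'

# Go through epoch seconds so that "+1" is never misread as a time zone offset.
epoch="$(date -d "$base" +%s)"
minus="$(date -d "@$((epoch - 1))" '+%F %T')"
plus="$(date -d "@$((epoch + 1))" '+%F %T')"

# Alternation of the three timestamps, ready to pass to grep -E.
pattern="$minus|$base|$plus"
echo "$pattern"
# prints: 2014-07-21 12:54:32|2014-07-21 12:54:33|2014-07-21 12:54:34
```

It would then be used the same way as above, for example grep -E "$pattern", assuming the date and time in the list are separated by a single space after the tr step.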

Thanks again.
