
How can I use sed or awk to correct a log time format?

I am currently working on a script that processes a CSV file and corrects certain aspects of it along the way. One of the things it does is correct the time format where needed. Two types of conversion take place:

 xx:xx:xx to PTxxHxxMxxS
 10:03:45 to PT10H03M45S

I have been able to do this using the code below, but I am trying to find out how to do it with sed or awk in order to speed up the process. In addition to the conversion itself, I would also like to keep count of the changes that are made (if, say, 4 values are converted, a counter would be incremented to 4). I have been able to do that easily inside the if statement below (the counter is not shown), but I do not know how to do it with sed/awk.

istimef=$( echo "$Sfcpp6" | grep ".*:.*:.*" )
if [ "$istimef" != "" ]; then
    hs=$( echo "$Sfcpp6" | cut -d ':' -f 1 )
    mn=$( echo "$Sfcpp6" | cut -d ':' -f 2 )
    sc=$( echo "$Sfcpp6" | cut -d ':' -f 3 )
    Sfcpp6=$( echo "PT"$hs"H"$mn"M"$sc"S" )
    echo "$Sfcpp6"
fi

which essentially checks whether a time value is present and then performs the conversion.

It's amazing how many processes and subshells you need for this task! I'll always be amazed at people's ingenuity and creativity. I counted 10 subshells and 4 process spawns.

Look, you can achieve exactly the same thing without spawning a single process and with no subshell whatsoever. We're talking about speed-up here!

First task: given a string of the form xx:yy:zz, transform it into PTxxHyyMzzS as efficiently as possible (look, only one command, and a builtin at that, no sed!):

$ string='12:34:56'
$ printf -v transformed 'PT%sH%sM%sS' ${string//:/ }
$ # Done! Don't believe me?
$ echo "$transformed"
PT12H34M56S

Now, before doing this, you probably want to check that the string is of the form xx:yy:zz. Drop grep for that. Just test it like this:

if [[ "$string" = *:*:* ]]; then
    echo "ok"
else
    echo "not ok"
fi
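
Note that the glob *:*:* accepts any string containing two colons, such as a:b:c. If stricter validation is wanted, one possible sketch uses a case pattern, which is also a builtin and spawns nothing. The function name to_duration is made up for illustration, and the rule of exactly two digits per field is an assumption about the input:

```shell
# to_duration is a hypothetical helper, not part of the original answer.
# It prints PTxxHyyMzzS for a strict xx:yy:zz value and returns 1 otherwise.
to_duration() {
    case $1 in
        [0-9][0-9]:[0-9][0-9]:[0-9][0-9])
            # Split with parameter expansion only: $1 = hours, $2 = "mm:ss".
            set -- "${1%%:*}" "${1#*:}"
            printf 'PT%sH%sM%sS\n' "$1" "${2%:*}" "${2#*:}"
            ;;
        *)
            return 1
            ;;
    esac
}

to_duration '10:03:45'                            # prints PT10H03M45S
to_duration 'a:b:c' || echo 'not a time value'    # prints not a time value
```

Calling the function directly, as above, keeps everything inside the current shell; wrapping it in $( ... ) would bring a subshell back.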

So the part of your script you showed us would be much more efficient written like this:

if [[ "$Sfcpp6" = *:*:* ]]; then
    printf -v Sfcp6 'PT%sH%sM%sS' ${Sfcpp6//:/ }
    echo "$Sfcp6"
fi

Total: 0 subshells, 0 processes spawned.

Or, if your goal is only to echo the transformed string:

if [[ "$Sfcpp6" = *:*:* ]]; then
    printf 'PT%sH%sM%sS\n' ${Sfcpp6//:/ }
fi
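
The question also asks for a count of the conversions. With the pure-bash approach this only needs a counter incremented inside the same if. A minimal sketch, assuming the values arrive one at a time in a loop (the sample values and the names values, result and count are invented for the example):

```shell
#!/usr/bin/env bash
# Sample values are made up; in the real script they would come from the CSV.
values=('10:03:45' 'n/a' '23:59:01')
count=0        # number of values converted so far
result=()      # corrected values, in input order

for Sfcpp6 in "${values[@]}"; do
    if [[ "$Sfcpp6" = *:*:* ]]; then
        # Unquoted on purpose: the colons become spaces and word splitting
        # hands printf three separate arguments.
        printf -v Sfcpp6 'PT%sH%sM%sS' ${Sfcpp6//:/ }
        count=$((count + 1))
    fi
    result+=("$Sfcpp6")
done

printf '%s\n' "${result[@]}"
echo "$count changes made."
```

For the three sample values this prints PT10H03M45S, n/a and PT23H59M01S, followed by "2 changes made.", still without spawning a process.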

A sed solution: use \(...\) to capture the digits and the character class [0-9] to match any digit.

sed 's/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\)/PT\1H\2M\3S/'
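
As a quick illustration on a made-up CSV line (the field layout is an assumption), the command can be fed from a pipe. Without the g flag only the first time value on each line is rewritten; adding g converts every match:

```shell
# Made-up CSV line containing two time values.
line='job42,10:03:45,ok,00:15:30'

# First match only, exactly as in the command above.
first_only=$(printf '%s\n' "$line" |
    sed 's/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\)/PT\1H\2M\3S/')

# With the g flag, every time value on the line is converted.
all=$(printf '%s\n' "$line" |
    sed 's/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\)/PT\1H\2M\3S/g')

echo "$first_only"    # job42,PT10H03M45S,ok,00:15:30
echo "$all"           # job42,PT10H03M45S,ok,PT00H15M30S
```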

If you want to count the substituted lines:

perl -pe '
    END{print "count=$count\n"}
    s/(\d{2}):(\d{2}):(\d{2})/PT$1H$2M$3S/ && $count++
' file.txt

The GNU awk equivalent of this sed solution posted by @choroba:

sed 's/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9]\)/PT\1H\2M\3S/'

would be the very similar:

awk '{print gensub(/([0-9][0-9]):([0-9][0-9]):([0-9][0-9])/,"PT\\1H\\2M\\3S","")}'

but the awk solution can be trivially modified to address your question of "would it be possible to make sed keep count of the changes that it has made?":

awk '
{
    orig = $0
    $0 = gensub(/([0-9][0-9]):([0-9][0-9]):([0-9][0-9])/, "PT\\1H\\2M\\3S", "")
    print
}
$0 != orig { count++ }
END { printf "%d changes made.\n", count }
'

while the sed solution can't.
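
gensub() is specific to GNU awk. Where only a POSIX awk is available, a similar effect might be had with match() and substr(). The sketch below is not the method from the answers above: it converts every time value on a line (not just the first) and counts individual substitutions rather than changed lines. The wrapper name convert_times is hypothetical:

```shell
# convert_times (hypothetical name): rewrite every hh:mm:ss on each input
# line and print the number of substitutions after the last line.
convert_times() {
    awk '
    {
        out = ""
        rest = $0
        # match() sets RSTART and RLENGTH; loop until no time value is left.
        while (match(rest, /[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/)) {
            t = substr(rest, RSTART, RLENGTH)    # for example 10:03:45
            out = out substr(rest, 1, RSTART - 1)
            out = out "PT" substr(t, 1, 2) "H" substr(t, 4, 2) "M" substr(t, 7, 2) "S"
            rest = substr(rest, RSTART + RLENGTH)
            count++
        }
        print out rest
    }
    END { printf "%d changes made.\n", count }
    ' "$@"
}

# Usage with made-up input; a file name could be passed instead of a pipe.
printf '%s\n' 'a,10:03:45,00:15:30' 'b,none' | convert_times
```

Because the count goes to standard output like the converted lines, a real script would probably redirect it or strip it before writing the CSV back.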


 