
Format and replace timestamp column using awk

I have multiple columns in the following format:

D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/2/2017 6:05:54 AM",ee,"09/2/2017 6:05:54 AM"
D,"5/01/2017 8:29:46 PM",ee,"5/01/2017 8:29:46 PM"
D,"4/2/2017 02:3:26 AM",ee,"4/2/2017 02:3:26 AM"

I want to format them as follows:

D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"

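I tried splitting the columns with awk -F "[,/:]" and then processing each piece based on its length,

but this becomes tedious when there are many columns.

Please suggest whether awk has any date-time or timestamp formatting option so that I can process the data column by column, which would be quicker.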

$ cat tst.awk
function fmt(t,    f) {
    split(t,f,/["\/ :]/)
    return sprintf("\"%02d/%02d/%04d %02d:%02d:%02d %s\"",f[2],f[3],f[4],f[5],f[6],f[7],f[8])
}
BEGIN { FS=OFS="," }
{ $2=fmt($2); $4=fmt($4); print }

$ awk -f tst.awk file
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
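
When many columns hold timestamps, naming each one ($2, $4, ...) gets tedious. A minimal sketch of one way around that, assuming every timestamp field is quoted and has the shape shown in the question: loop over all fields and reformat only those that match the pattern. The wrapper name pad_timestamps is hypothetical; fmt is the function from the script above.

```shell
# Hypothetical wrapper: zero-pad every field that looks like a quoted timestamp.
pad_timestamps() {
    awk '
    function fmt(t,    f) {
        # Splitting on the quote, slash, space and colon leaves f[1] empty,
        # so the date and time parts are f[2]..f[8].
        split(t, f, /["\/ :]/)
        return sprintf("\"%02d/%02d/%04d %02d:%02d:%02d %s\"", f[2], f[3], f[4], f[5], f[6], f[7], f[8])
    }
    BEGIN { FS = OFS = "," }
    {
        for (i = 1; i <= NF; i++)
            # Assumed field shape: "N/N/N N:N:N AM" or "... PM", quotes included.
            if ($i ~ /^"[0-9]+\/[0-9]+\/[0-9]+ [0-9]+:[0-9]+:[0-9]+ [AP]M"$/)
                $i = fmt($i)
        print
    }' "$@"
}

# Sample call on one line; the trailing 7 is not a timestamp and is left alone.
printf '%s\n' 'D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 02:3:26 AM",7' | pad_timestamps
```

Fields that do not match the pattern pass through unchanged, so the same call works regardless of which columns carry timestamps.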

I suggest using awk and its printf to format the output:

awk -F '["/ :]' '{printf "%s\"%.2d/%.2d/%d %.2d:%.2d:%.2d %s\"%s\"%.2d/%.2d/%d %.2d:%.2d:%.2d %s\"\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16}' file

Output:

D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
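
The one-liner above depends on how -F '["/ :]' carves up each line, which is easy to lose track of. A small sketch, using a hypothetical helper name show_fields, that prints every field with its number so the mapping to $1 through $16 can be checked:

```shell
# Hypothetical helper: list each field produced by the separator set ["/ :].
show_fields() {
    awk -F '["/ :]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }' "$@"
}

# Sample call on the first input line from the question.
printf '%s\n' 'D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"' | show_fields
```

For the sample line, $1 is "D," and $9 is ",ee," (the commas travel with the neighbouring text because the comma is not a separator here), while $2 to $8 and $10 to $16 hold the two timestamps. The closing quote leaves an empty $17, which the printf simply ignores.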

Using GNU awk (split with seps). Code:

function doit(str,    a, seps, n, j, b) {       # a, seps, n, j, b are locals
    gsub(/"/,"",str)                            # remove quotes
    n=split(str,a,"[/ :]",seps)                 # split; seps[j] follows a[j] (GNU awk)
    for(j=1;j<=n;j++) {                         # loop all elements in a
        if(a[j]~/^[0-9]+$/)                     # process all number elements
            a[j]=sprintf("%02d", a[j])          # zeropad
        b=b a[j] seps[j]                        # gather buffer, keeping each separator
    }
    return "\"" b "\""                          # return quoted
}
BEGIN { FS=OFS="," }
{
    for(i=2;i<=NF;i+=2)                         # loop the right ones
        $i=doit($i)                             # call the contractor
}
1

Run it:

$ awk -f program.awk file

Output:

D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"

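You can also use sed to zero-pad every single digit that sits between word boundaries. However, it changes any single digit in the data, even one outside the date columns, so use it only when you want every single-digit number padded with 0.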

sed 's|\b\([[:digit:]]\)\b|0\1|g'

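Use sed's -i option if you want the changes written back to the file.

How it works:

The regex \b\([[:digit:]]\)\b matches a single digit between word boundaries and captures it with the escaped parentheses \( \). In the replacement part of the sed command, a hard-coded 0 followed by \1, the first captured group, zero-pads that digit.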
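
The caveat about digits outside the date columns is easy to see on a line that carries an extra numeric field. A short sketch, assuming GNU sed (the \b word-boundary escape is a GNU extension) and a hypothetical wrapper name pad_all_single_digits:

```shell
# Hypothetical wrapper around the sed command above (assumes GNU sed for \b).
pad_all_single_digits() {
    sed 's|\b\([[:digit:]]\)\b|0\1|g' "$@"
}

# The third field, 7, is not part of any timestamp but is still padded.
printf '%s\n' 'D,"4/2/2017 2:45:56 PM",7,x' | pad_all_single_digits
```

With GNU sed this prints D,"04/02/2017 02:45:56 PM",07,x, so the standalone 7 becomes 07 along with the date and time parts.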

Regex demo

To see how this regex works, see the regex demo.

Working example:

bash-4.2$ cat file1
D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/2/2017 6:05:54 AM",ee,"09/2/2017 6:05:54 AM"
D,"5/01/2017 8:29:46 PM",ee,"5/01/2017 8:29:46 PM"
D,"4/2/2017 02:3:26 AM",ee,"4/2/2017 02:3:26 AM"

bash-4.2$ sed -i 's|\b\([[:digit:]]\)\b|0\1|g' file1

bash-4.2$ cat file1
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"

