Format and replace timestamp column using awk
I have multiple columns in the following format:
D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/2/2017 6:05:54 AM",ee,"09/2/2017 6:05:54 AM"
D,"5/01/2017 8:29:46 PM",ee,"5/01/2017 8:29:46 PM"
D,"4/2/2017 02:3:26 AM",ee,"4/2/2017 02:3:26 AM"
I want to format them as follows:
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
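I tried separating the columns with awk -F '[,/:]' and then processing each piece based on its length,
but with multiple columns this becomes tedious.
Please suggest whether awk has any date-time or timestamp formatting option, so that I can process the data column by column quickly.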
$ cat tst.awk
function fmt(t, f) {
split(t,f,/["\/ :]/)
return sprintf("\"%02d/%02d/%04d %02d:%02d:%02d %s\"",f[2],f[3],f[4],f[5],f[6],f[7],f[8])
}
BEGIN { FS=OFS="," }
{ $2=fmt($2); $4=fmt($4); print }
$ awk -f tst.awk file
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
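The indices in fmt start at f[2] because the leading double quote is itself a separator, so split leaves an empty first element. A minimal sketch to make that visible; the helper name show_split is hypothetical and exists only for illustration:

```shell
# show_split (hypothetical helper): print every element that split() produces
# for one quoted timestamp, using the same separator regex as fmt above.
show_split() {
  printf '%s\n' "$1" | awk '{
    n = split($0, f, /["\/ :]/)
    for (i = 1; i <= n; i++)
      printf "f[%d]=<%s>\n", i, f[i]
  }'
}

show_split '"4/2/2017 2:45:56 PM"'
# f[1]=<>
# f[2]=<4>
# f[3]=<2>
# f[4]=<2017>
# f[5]=<2>
# f[6]=<45>
# f[7]=<56>
# f[8]=<PM>
# f[9]=<>
```

The empty f[1] and f[9] come from the opening and closing quotes, which is why sprintf reads f[2] through f[8].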
I suggest using awk and its printf to format the output:
awk -F '["/ :]' '{printf "%s\"%.2d/%.2d/%d %.2d:%.2d:%.2d %s\"%s\"%.2d/%.2d/%d %.2d:%.2d:%.2d %s\"\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16}' file
Output:
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
Using GNU awk (for split with seps). Code:
function doit(str,    b) {                      # b is a local var buffer
    gsub(/\"/,"",str);                          # remove quotes
    n=split(str,a,"[/ :]",seps);                # split on special chars
    for(j=1;j<=n;j++) {                         # loop all elements in a
        if(a[j]~/^[0-9]+$/)                     # process all number elements
            a[j]=sprintf("%02d", a[j]) seps[j]; # zeropad and re-append the separator
        b=b a[j]                                # gather buffer
    }
    return "\"" b "\""                          # return quoted
}
BEGIN { FS=OFS="," }
{
    for(i=2;i<=NF;i+=2)                         # loop the right ones
        $i=doit($i)                             # call the contractor
}
1
Run:
$ awk -f program.awk file
Output:
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"
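The four-argument split with seps is specific to GNU awk. For other awks, a similar per-column loop can be sketched with POSIX features only. This is an illustration, not part of the answer above: the function name pad_timestamps is hypothetical, the regex describing a timestamp is an assumption based on the sample data, and it assumes no quoted field contains a comma. Rather than relying on even field positions, it pads only the fields that look like timestamps:

```shell
# pad_timestamps (hypothetical helper): zero-pad every comma-separated field
# that looks like "M/D/YYYY h:m:s AM|PM"; leave all other fields untouched.
pad_timestamps() {
  awk '
  function fmt(t,    f) {                # f is a local array
      split(t, f, /["\/ :]/)             # f[1] is empty because of the leading quote
      return sprintf("\"%02d/%02d/%04d %02d:%02d:%02d %s\"",
                     f[2], f[3], f[4], f[5], f[6], f[7], f[8])
  }
  BEGIN { FS = OFS = "," }
  {
      for (i = 1; i <= NF; i++)          # visit every column
          if ($i ~ /^"[0-9]+\/[0-9]+\/[0-9]+ [0-9]+:[0-9]+:[0-9]+ [AP]M"$/)
              $i = fmt($i)               # reformat only timestamp-shaped fields
      print
  }' "$@"
}

printf '%s\n' 'D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 02:3:26 AM",7' | pad_timestamps
# D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:03:26 AM",7
```

Because the columns are selected by pattern, adding or reordering timestamp columns should not require changing the script, and a lone digit such as the trailing 7 is left alone.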
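You can also use sed to pad every single digit that sits between word boundaries with a leading 0. However, this changes any lone digit in the data, even outside the date columns, so use it only if you want every single digit zero-padded: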
sed 's|\b\([[:digit:]]\)\b|0\1|g'
To make the changes permanent, use sed's -i option to edit the file in place.
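How this works:
The regex \b\([[:digit:]]\)\b matches a single digit between word boundaries (\b is a GNU sed extension) and captures it with the escaped parentheses \( \). In the replacement part of the sed command, the hardcoded 0 followed by \1, the first captured group, pads that single digit with a 0.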
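The caveat about digits outside the date columns is easy to reproduce. A small sketch, assuming GNU sed (for \b); the helper name pad_digits and the sample line with a trailing 7 are made up for illustration:

```shell
# pad_digits (hypothetical helper): apply the same substitution to stdin.
pad_digits() {
  sed 's|\b\([[:digit:]]\)\b|0\1|g'
}

# The last field is a plain number, not part of any timestamp.
printf '%s\n' 'D,"4/2/2017 2:45:56 PM",7' | pad_digits
# D,"04/02/2017 02:45:56 PM",07
```

The standalone 7 becomes 07 as well, so this approach only fits data where no other column can hold a lone digit.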
Regex demo
To see how this regex works, see the regex demo.
Working example:
bash-4.2$ cat file1
D,"4/2/2017 2:45:56 PM",ee,"4/2/2017 2:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/2/2017 6:05:54 AM",ee,"09/2/2017 6:05:54 AM"
D,"5/01/2017 8:29:46 PM",ee,"5/01/2017 8:29:46 PM"
D,"4/2/2017 02:3:26 AM",ee,"4/2/2017 02:3:26 AM"
bash-4.2$ sed -i 's|\b\([[:digit:]]\)\b|0\1|g' file1
bash-4.2$ cat file1
D,"04/02/2017 02:45:56 PM",ee,"04/02/2017 02:45:56 PM"
D,"03/02/2017 03:47:16 PM",ee,"03/02/2017 03:47:16 PM"
D,"09/02/2017 06:05:54 AM",ee,"09/02/2017 06:05:54 AM"
D,"05/01/2017 08:29:46 PM",ee,"05/01/2017 08:29:46 PM"
D,"04/02/2017 02:03:26 AM",ee,"04/02/2017 02:03:26 AM"