Change date formats to ISO dates (yyyy-mm-dd) in multiple columns of each line using awk or sed
Input file:
xyz|name1|address1|19600221|M|country1|20200129|etc1
xyz|name2|address2|19610321|M|country2|20200118|etc1
xyz|name3|address3|19520217|M|country3||etc1
xyz|name4|address4|19611111|M|country4||etc1
Expected output:
xyz|name1|address1|1960-02-21|M|country1|2020-01-29|etc1
xyz|name2|address2|1961-03-21|M|country2|2020-01-18|etc1
xyz|name3|address3|1952-02-17|M|country3||etc1
xyz|name4|address4|1961-11-11|M|country4||etc1
Code that I used:
awk -F"|" '{OFS="|";$4=strftime("%Y-%m-%d", $4); print$0}' input.txt
I redirected the output to a new file and ran the same command for the 7th column, but the result was not as expected. I got the following:
xyz|name1|address1|1960-02-21|M|country1|1970-08-22|etc1
xyz|name2|address2|1961-03-21|M|country2|1970-08-22|etc1
xyz|name3|address3|1952-02-17|M|country3|1970-01-01|etc1
xyz|name4|address4|1961-11-11|M|country4|1970-01-01|etc1
I don't understand why the column 7 output is different. Any advice on what is wrong here?
Could you please try the following. Written and tested with the shown samples in GNU awk.
awk '
BEGIN{
FS=OFS="|"
}
{
$7=($7!=""?substr($7,1,4)"-"substr($7,5,2)"-"substr($7,7):"")
}
1' Input_file
Explanation: a detailed, line-by-line explanation of the above.
awk ' ##Start of the awk program.
BEGIN{ ##Start of the BEGIN section.
FS=OFS="|" ##Set both the field separator and the output field separator to |.
}
{
$7=($7!=""?substr($7,1,4)"-"substr($7,5,2)"-"substr($7,7):"") ##If the 7th field is not empty, use substr to rebuild it as yyyy-mm-dd; otherwise leave it empty.
}
1 ##1 prints the current line, edited or not.
' Input_file ##Name of the input file.
NOTE: If you have one or more columns that contain dates and you want to change all of them to yyyy-mm-dd
format, you could use a for loop such as for (i=1;i<=NF;i++){if($i~/^[0-9]{8}$/){substr(..code above)...}
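The loop in the note can be written out in full. This is only a sketch: the wrapper name iso_dates is made up for illustration, and the {8} interval is replaced by a length check on the assumption that some awk implementations do not support interval expressions.

```shell
# iso_dates is a hypothetical wrapper name, not part of the original answer.
# It rewrites every field made of exactly 8 digits as yyyy-mm-dd.
iso_dates() {
  awk '
  BEGIN { FS = OFS = "|" }
  {
    for (i = 1; i <= NF; i++)
      # Length check plus digits-only test stands in for /^[0-9]{8}$/.
      if (length($i) == 8 && $i ~ /^[0-9]+$/)
        $i = substr($i, 1, 4) "-" substr($i, 5, 2) "-" substr($i, 7, 2)
  }
  1' "$@"
}
```

Run it as iso_dates Input_file, or pipe data into it. Empty fields and fields that are not exactly 8 digits pass through unchanged, so both column 4 and column 7 are handled in one pass.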
You could use sed for this:
$ sed -E 's/([0-9]{4})([0-9]{2})([0-9]{2})/\1-\2-\3/g' input.txt
xyz|name1|address1|1960-02-21|M|country1|2020-01-29|etc1
xyz|name2|address2|1961-03-21|M|country2|2020-01-18|etc1
xyz|name3|address3|1952-02-17|M|country3||etc1
xyz|name4|address4|1961-11-11|M|country4||etc1
This is not an answer, but an extended comment.
@thanasisp's comment is on the mark. @RavinderSingh13's answer shows how to split the date apart and then join the pieces into the desired format. If you want to use the time functions, you still need to do that:
# reformatdate.awk
BEGIN {FS = OFS = "|"}
function formatDate(d, t) {
t = mktime(substr(d,1,4) " " substr(d,5,2) " " substr(d,7,2) " 0 0 0")
return strftime("%Y-%m-%d", t)
}
{
$4 = formatDate($4)
if ($7) $7 = formatDate($7)
print
}
Then:
$ gawk -f reformatdate.awk input.txt
xyz|name1|address1|1960-02-21|M|country1|2020-01-29|etc1
xyz|name2|address2|1961-03-21|M|country2|2020-01-18|etc1
xyz|name3|address3|1952-02-17|M|country3||etc1
xyz|name4|address4|1961-11-11|M|country4||etc1
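As a side check on why the original attempt printed 1970 dates: strftime expects a count of seconds since the epoch, so a value like 20200129 is read as roughly 233 days after 1970-01-01. A small sketch, assuming gawk is installed; the function name epoch_to_date is made up for illustration, and the third strftime argument is gawk's UTC flag, used here so the result does not depend on the local time zone.

```shell
# epoch_to_date is a hypothetical helper name used only for this illustration.
# It prints the date that gawk's strftime derives from a raw number of seconds.
epoch_to_date() {
  # The third argument (1) asks gawk to format the timestamp in UTC.
  gawk -v ts="$1" 'BEGIN { print strftime("%Y-%m-%d", ts, 1) }'
}

# 20200129 seconds is a little under 234 days, which lands in August 1970.
epoch_to_date 20200129
# An empty field is treated as 0 seconds, which is the epoch itself.
epoch_to_date ""
```

This is why the yyyymmdd string has to be split into year, month and day before it reaches mktime.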