
Change date formats to a modified ISO date in multiple columns of each line using awk or sed

Input file

xyz|name1|address1|19600221|M|country1|20200129|etc1
xyz|name2|address2|19610321|M|country2|20200118|etc1
xyz|name3|address3|19520217|M|country3||etc1
xyz|name4|address4|19611111|M|country4||etc1

Expected output

xyz|name1|address1|1960-02-21|M|country1|2020-01-29|etc1
xyz|name2|address2|1961-03-21|M|country2|2020-01-18|etc1
xyz|name3|address3|1952-02-17|M|country3||etc1
xyz|name4|address4|1961-11-11|M|country4||etc1

Code that I used

awk -F"|" '{OFS="|";$4=strftime("%Y-%m-%d", $4); print$0}' input.txt

I redirected the output to a new file and ran the same command for the 7th column, but the result was not as expected. I got the following:

xyz|name1|address1|1960-02-21|M|country1|1970-08-22|etc1
xyz|name2|address2|1961-03-21|M|country2|1970-08-22|etc1
xyz|name3|address3|1952-02-17|M|country3|1970-01-01|etc1
xyz|name4|address4|1961-11-11|M|country4|1970-01-01|etc1

I can't understand why the column 7 output is different. Any advice on what is wrong here?

Could you please try the following? Written and tested with the shown samples in GNU awk.

awk '
BEGIN{
  FS=OFS="|"
}
{
  $7=($7!=""?substr($7,1,4)"-"substr($7,5,2)"-"substr($7,7):"")
}
1' Input_file

Explanation: a detailed walkthrough of the code above.

awk '                                         ## Start of the awk program.
BEGIN{                                        ## Start of the BEGIN section.
  FS=OFS="|"                                  ## Set both the field separator and the output field separator to |.
}
{
  $7=($7!=""?substr($7,1,4)"-"substr($7,5,2)"-"substr($7,7):"")  ## If the 7th field is not empty, use substr to rebuild it as yyyy-mm-dd.
}
1                                             ## 1 prints the current line, edited or not.
' Input_file                                  ## Name of the input file.

NOTE: If one or more columns contain dates and you want to change all of them to yyyy-mm-dd format, you could use a for loop such as for (i=1;i<=NF;i++){if($i~/^[0-9]{8}$/){substr(..code above)...}
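The loop mentioned in the note could be spelled out roughly as follows. This is only a sketch: the function name reformat_dates is made up for illustration, and it assumes every field made of exactly eight digits is a yyyymmdd date. The length() check plus a plain digit test stands in for the {8} interval, which some older awk implementations do not support.

```shell
# Hypothetical helper: reformat every field that consists of exactly eight digits.
reformat_dates() {
  awk '
  BEGIN { FS = OFS = "|" }
  {
    for (i = 1; i <= NF; i++) {
      # length() plus a digit-only test avoids relying on {8} interval support
      if (length($i) == 8 && $i ~ /^[0-9]+$/) {
        $i = substr($i, 1, 4) "-" substr($i, 5, 2) "-" substr($i, 7, 2)
      }
    }
    print
  }' "$@"
}

# Usage: reformat_dates input.txt
```

Empty fields fail the length test, so they pass through untouched and no separate check for an empty 7th column is needed.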

You could use sed for this:

$ sed -E 's/([0-9]{4})([0-9]{2})([0-9]{2})/\1-\2-\3/g' input.txt
xyz|name1|address1|1960-02-21|M|country1|2020-01-29|etc1
xyz|name2|address2|1961-03-21|M|country2|2020-01-18|etc1
xyz|name3|address3|1952-02-17|M|country3||etc1
xyz|name4|address4|1961-11-11|M|country4||etc1
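One caveat with a pattern like this: it rewrites any run of eight digits, so a longer number such as 1234567890 in another field would come out as 1234-56-7890. If that matters for your data, a variant anchored to the | delimiters might look like the sketch below. The function name reformat_dates_sed is made up for illustration, and the loop (:a ... ta) is there because two adjacent date fields share a delimiter, which a single /g pass would not handle.

```shell
# Hypothetical helper; assumes "|" is the delimiter, as in the sample input.
reformat_dates_sed() {
  # Repeat the substitution until no whole field of exactly eight digits remains.
  sed -E \
    -e ':a' \
    -e 's/(^|\|)([0-9]{4})([0-9]{2})([0-9]{2})(\||$)/\1\2-\3-\4\5/' \
    -e 'ta' "$@"
}

# Usage: reformat_dates_sed input.txt
```

Each pass converts one field and then retries from the label, so the loop stops as soon as no eight-digit field is left.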

This is not an answer, but an extended comment.

@thanasisp's comment is on the mark. @RavinderSingh13's answer shows how to split the date apart and then join the pieces into the desired format. If you want to use the time functions, you still need to do that:

# reformatdate.awk

BEGIN {FS = OFS = "|"}

function formatDate(d,    t) {
  t = mktime(substr(d,1,4) " " substr(d,5,2) " " substr(d,7,2) " 0 0 0")
  return strftime("%Y-%m-%d", t)
}

{
  $4 = formatDate($4)
  if ($7) $7 = formatDate($7)
  print
}

Then:

$ gawk -f reformatdate.awk input.txt
xyz|name1|address1|1960-02-21|M|country1|2020-01-29|etc1
xyz|name2|address2|1961-03-21|M|country2|2020-01-18|etc1
xyz|name3|address3|1952-02-17|M|country3||etc1
xyz|name4|address4|1961-11-11|M|country4||etc1
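As for why the original command gave 1970 dates in column 7: strftime() expects its second argument to be a timestamp in seconds since 1970-01-01, not a yyyymmdd string. A rough way to see this without gawk is to turn the raw value into whole days. The helper name days_since_epoch is made up for illustration, and the sketch ignores time zones.

```shell
# Treat the raw field as a count of seconds, the way strftime() does,
# and convert it to whole days after 1970-01-01.
days_since_epoch() {
  awk -v secs="$1" 'BEGIN { printf "%d\n", secs / 86400 }'
}

days_since_epoch 20200129   # prints 233
```

Day 233 after January 1 falls on August 22, which matches the 1970-08-22 in the question, and an empty field is treated as 0, which gives 1970-01-01.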
