
Is it possible to rename multiple columns of a CSV to empty column names when using Miller?

I have CSV files with headers like this:

MyFirstCol,MySecondCol,MyThirdCol,.....MyLastRealCol,ppp,qqq,rrr

I want to give the columns ppp, qqq, etc. empty headers. (I do not want to delete them!) So I want a resulting CSV with a header like this:

MyFirstCol,MySecondCol,MyThirdCol,.....MyLastRealCol,,,

(Note the empty, but present, columns at the end.)

Is there a way to do this with Miller? (*) I tried

mlr --csv rename -r '"^(.){3}$",' myFile.csv

but this command folds all the matching columns into one! :-(


(*) I do know how to hack this together with a search-and-replace command in sed, but I don't like it as a general solution, because sed is not aware of the CSV's column structure. Therefore I am hoping for a solution with Miller.
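For context, a sed workaround of the kind mentioned in the footnote might look like the sketch below. It is only an illustration, not necessarily the command the asker had in mind: the file names `sample.csv` and `sample_out.csv` and the shortened header are hypothetical, and the pattern assumes plain, unquoted fields.

```shell
# Hypothetical sample file; the column names follow the question's example.
printf 'MyFirstCol,MySecondCol,ppp,qqq,rrr\n1,2,,,\n4,7,,,\n' > sample.csv

# Blank the last three header names on line 1 only.
# The pattern is tied to these exact names and to plain, unquoted fields.
sed '1s/,ppp,qqq,rrr$/,,,/' sample.csv > sample_out.csv

cat sample_out.csv
```

The substitution works on raw text, so it has to be rewritten whenever the trailing column names change, and it cannot tell a field separator from a comma inside a quoted header. That is the limitation the footnote refers to.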

If I understand correctly, just remove the empty columns:

mlr --csv remove-empty-columns input.csv >output.csv

If you want to use rename, the command is

mlr --csv rename -r '^.{3}$,' input.csv >output.csv

But please note that in Miller you cannot have a CSV with two or more fields with the same name. And if you have

MyFirstCol,MySecondCol,MyThirdCol,.....MyLastRealCol,,,

the last fields all have the same empty field name. As a workaround, you can add a numeric progressive heading, then apply a search and replace to the first data row, and at the end remove the numeric heading.

Starting from

field1,field2,ppp,qqq,zzz
1,2,,,
4,7,,,

and running

mlr --csv -N put -S 'if(NR==1){for (k in $*) {$[k] = gsub($[k], "^.{3}$", "");}}' input.csv

you will have

field1,field2,,,
1,2,,,
4,7,,,

Some points:

  • -N adds a numeric heading on input and removes it on output, so the original header line is treated as data;
  • if(NR==1) applies the put expression only to the first data row, which here is the original header line field1,field2,ppp,qqq,zzz.
