
Is it possible to rename multiple columns of a CSV to empty column names when using Miller?

I have CSV files with headers like this

MyFirstCol,MySecondCol,MyThirdCol,.....MyLastRealCol,ppp,qqq,rrr

I want the columns ppp, qqq, etc. to have empty headers. (I do not want to delete them!) So I want a resulting CSV with a header like this:

MyFirstCol,MySecondCol,MyThirdCol,.....MyLastRealCol,,,

(Note the empty, but present columns at the end.)

Is there a way to do this with Miller? (*) I tried

mlr --csv rename -r '"^(.){3}$",' myFile.csv

but this command folds all the matching columns into one! :-(


(*) I do know how to hack this together with a search-and-replace command in sed, but I don't like it as a general solution, because sed is not aware of the CSV's column structure. Therefore I am hoping for a solution with Miller.

If I understand correctly, you can just remove the empty columns:

mlr --csv remove-empty-columns input.csv >output.csv

If you want to use rename, the command is

mlr --csv rename -r '^.{3}$,' input.csv >output.csv

But please note that in Miller a CSV cannot have two or more fields with the same name. And if you have

MyFirstCol,MySecondCol,MyThirdCol,.....MyLastRealCol,,,

the last fields all have the same empty field name. As a workaround, you can add a progressive numeric header, apply a search-and-replace to the first data row (which is the original header line), and remove the numeric header at the end.
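To illustrate why the renamed columns fold into one, here is a minimal Python sketch (not Miller code). It assumes a record behaves like an ordered map keyed by field name; the function name rename_keys and the sample values a, b, c are hypothetical. Which value Miller itself keeps after the collision is not claimed here; the point is only that one key can hold one column.

```python
import re

def rename_keys(record, pattern, replacement):
    """Rename keys matching `pattern`; equal new names collide in the map."""
    renamed = {}
    for name, value in record.items():
        # Two keys that map to the same new name end up in a single slot.
        renamed[re.sub(pattern, replacement, name)] = value
    return renamed

# Hypothetical record with three 3-character field names.
record = {"field1": "1", "field2": "2", "ppp": "a", "qqq": "b", "zzz": "c"}
print(rename_keys(record, r"^.{3}$", ""))
```

Five fields go in and only three come out, because ppp, qqq and zzz all become the same empty key.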

Starting from

field1,field2,ppp,qqq,zzz
1,2,,,
4,7,,,

and running

mlr --csv -N put -S 'if(NR==1){for (k in $*) {$[k] = gsub($[k], "^.{3}$", "");}}' input.csv

you will have

field1,field2,,,
1,2,,,
4,7,,,

Some points:

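  • -N adds an implicit numeric header on input and removes it on output, so the original header line is treated as the first data row;
  • if(NR==1) restricts the substitution in the put verb to that first data row, which here is field1,field2,ppp,qqq,zzz.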
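For readers who want to see the same logic outside Miller, the two points above can be sketched in Python with the standard csv module. This is only an illustration under stated assumptions, not a replacement for the mlr command: the function name blank_matching_headers is hypothetical, and the sample data is the input shown earlier.

```python
import csv
import io
import re

def blank_matching_headers(csv_text, pattern=r"^.{3}$"):
    """Blank header cells that match `pattern`; data rows are left untouched."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if rows:
        # The header is handled as an ordinary first row, which is what -N
        # makes possible in Miller; only this row is rewritten (NR == 1).
        rows[0] = [re.sub(pattern, "", cell) for cell in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

sample = "field1,field2,ppp,qqq,zzz\n1,2,,,\n4,7,,,\n"
print(blank_matching_headers(sample), end="")
```

Because the rows are parsed as CSV rather than edited as raw text, the column structure is preserved, which addresses the concern about sed raised in the question.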
