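How to remove double quotes (") and newlines in between ," and ", in a Unix file
I receive a comma-separated file in which string and date fields are enclosed in double quotes. Inside the string columns we are getting stray " characters and newline characters, as shown below.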
"1234","asdf","with"doublequotes","new line
feed","withmultiple""doublequotes"
The desired output looks like this:
"1234","asdf","withdoublequotes","new linefeed","withmultipledoublequotes"
I tried:
sed 's/\([^",]\)"\([^",]\)/\1\2/g;s/\([^",]\)""/\1"/g;s/""\([^",]\)/"\1/g' < infile > outfile
It removes double quotes inside the strings, but it also removes the last double quote, as shown below:
"1234","asdf","withdoublequotes","new line
feed","withmultiple"doublequotes
Is there a way to remove the " and newline characters that occur between ," and ",?
You can try rquery ( https://github.com/fuyuncat/rquery ); its built-in functions are handy for this.
[ rquery]$ cat mess.cvs
"1234","asdf","with"doublequotes","new line
feed","withmultiple""doublequotes"
[ rquery]$ ./rq -q "p /^\"([^\"]*)\",\"([^,]*)\",\"([^,]*)\",\"([^,]*)\",\"([^,]*)\"/ |s '\"'+replace(regreplace(@1,'\n',''),'\"','')+'\",\"'+replace(regreplace(@2,'\n',''),'\"','')+'\",\"'+replace(regreplace(@3,'\n',''),'\"','')+'\",\"'+replace(regreplace(@4,'\n',''),'\"','')+'\",\"'+replace(regreplace(@5,'\n',''),'\"','')+'\"'" mess.cvs
"1234","asdf","withdoublequotes","new line feed","withmultipledoublequotes"
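Your substitutions for two consecutive quotes do not work because they are placed after the substitution for a single quote, and by that point only one of the two quotes is left.
We can remove the " characters by repeating the substitution (otherwise quotes carried over in a replacement would remain), and we can remove the newline by joining the next input line whenever the current line does not end with a quote: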
sed ':1;/[^"]$/{;N;s/\n//;b1;};:0;s/\([^,]\)"\([^,]\)/\1\2/g;t0' <infile >outfile
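The one-liner above can be easier to follow when split into one command per `-e` expression. The following is only a sketch of the same approach: the function name `clean_csv` and the label names `join` and `strip` are made up for illustration, and it assumes that every record ends with a closing quote and that no field contains a quote directly next to a comma.

```shell
# Hypothetical wrapper around the sed approach described above.
clean_csv() {
  # join:  while the line does not end with a quote, append the next
  #        input line and delete the embedded newline.
  # strip: delete any quote that has a non-comma character on both
  #        sides; repeat until no substitution happens, because one
  #        pass can leave a quote behind (e.g. in e""d).
  sed -e ':join' \
      -e '/[^"]$/{' \
      -e 'N' \
      -e 's/\n//' \
      -e 'b join' \
      -e '}' \
      -e ':strip' \
      -e 's/\([^,]\)"\([^,]\)/\1\2/g' \
      -e 't strip'
}

# Feed the sample record from the question through the filter.
printf '%s\n' \
  '"1234","asdf","with"doublequotes","new line' \
  'feed","withmultiple""doublequotes"' | clean_csv
# prints: "1234","asdf","withdoublequotes","new linefeed","withmultipledoublequotes"
```

To process a file, the same function can be used as `clean_csv <infile >outfile`. Quotes at the very start or end of a record are left alone because the pattern requires a character on both sides of the quote.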