How to remove double quotes (") and newlines in between ," and ", in a Unix file

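I receive a comma-separated file in which the string and date fields are enclosed in double quotes. Some string columns contain stray " characters and newlines, as shown below.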

"1234","asdf","with"doublequotes","new line
feed","withmultiple""doublequotes"

The desired output is:

"1234","asdf","withdoublequotes","new linefeed","withmultipledoublequotes"

I tried:

sed 's/\([^",]\)"\([^",]\)/\1\2/g;s/\([^",]\)""/\1"/g;s/""\([^",]\)/"\1/g' < infile > outfile

It removes the double quotes inside the strings, but it also removes the last double quote, as shown below:

"1234","asdf","withdoublequotes","new line
feed","withmultiple"doublequotes

Is there a way to remove the " characters and newlines that occur between ", and ,"?

You could try rquery (https://github.com/fuyuncat/rquery); its built-in functions are handy for this.

[ rquery]$ cat mess.cvs
"1234","asdf","with"doublequotes","new line
feed","withmultiple""doublequotes"
[ rquery]$ ./rq -q "p /^\"([^\"]*)\",\"([^,]*)\",\"([^,]*)\",\"([^,]*)\",\"([^,]*)\"/ |s '\"'+replace(regreplace(@1,'\n',''),'\"','')+'\",\"'+replace(regreplace(@2,'\n',''),'\"','')+'\",\"'+replace(regreplace(@3,'\n',''),'\"','')+'\",\"'+replace(regreplace(@4,'\n',''),'\"','')+'\",\"'+replace(regreplace(@5,'\n',''),'\"','')+'\"'" mess.cvs
"1234","asdf","withdoublequotes","new line feed","withmultipledoublequotes"

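Your substitutions for two consecutive quotes do not work because they come after the substitution for a single quote, by which point only one of the two quotes is left.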

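We can remove the " characters by repeating the substitution (otherwise a quote left over after a replacement would remain), and remove the newlines by joining the next input line whenever the current line does not end with a quote: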

sed ':1;/[^"]$/{;N;s/\n//;b1;};:0;s/\([^,]\)"\([^,]\)/\1\2/g;t0' <infile >outfile
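The same script may be easier to follow when split into one command per line. The sketch below wraps it in a shell function; the name clean_csv and the labels join and strip are illustrative choices, not part of the original answer. It assumes that every complete record ends with a double quote and that no field legitimately contains a quote next to a non-comma character.

```shell
# Hypothetical helper: reads the broken CSV on stdin, writes the cleaned CSV to stdout.
clean_csv() {
  sed -e ':join' \
      -e '/[^"]$/{' \
      -e 'N' \
      -e 's/\n//' \
      -e 'b join' \
      -e '}' \
      -e ':strip' \
      -e 's/\([^,]\)"\([^,]\)/\1\2/g' \
      -e 't strip'
}
# :join .. b join  while the line does not end with a quote, append the next
#                  line (N) and delete the embedded newline.
# :strip .. t strip  delete any quote whose neighbours on both sides are not
#                  commas; repeat until no substitution succeeds, so that
#                  doubled quotes ("") are removed completely.
```

It can be used the same way as the one-liner, for example clean_csv <infile >outfile. Quotes at the very start or end of a record are kept because the pattern requires a character on both sides of the quote.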
