I am trying to produce the output shown below, but I don't know how. I searched the internet, but I couldn't work out the right keywords to search for, so I am posting my question here. I have a CSV file, data.csv. What I have tried so far is this MWE:
cat data.csv|sed 's/\n.*//g'
The contents of data.csv are:
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,
line 5 text
10,1,6,"<J>
line 6 text"
10,1,7,"line 7 text"
10,1,8,"
line 8 text"
10,1,9,"line 9 text"
I want the output shown below:
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,"line 5 text"
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
With GNU sed:
sed '/".*"$/!{N;s/\n *//}' file
If a line does not match the regex ".*"$, append the next line (N) to sed's pattern space and replace the newline followed by zero or more spaces with nothing (s/\n *//).
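To see the N step in isolation, here is a small sketch that runs the same expression on a two-record sample. The sample data and the function name join_split are made up for illustration, and GNU sed is assumed, as in the answer.

```shell
# join_split is a hypothetical wrapper around the command from the answer.
# It reads the named files, or standard input when no file is given.
join_split() {
  # Lines that do not end in a complete "..." field pull in the next
  # line (N); the embedded newline and any spaces after it are deleted.
  sed '/".*"$/!{N;s/\n *//}' "$@"
}

# Made-up sample: the second record is split across two lines.
printf '%s\n' '10,1,1,"line 1 text"' '10,1,2,"' '   line 2 text"' | join_split
# prints:
# 10,1,1,"line 1 text"
# 10,1,2,"line 2 text"
```

Note that N runs only once per non-matching line, so a field split across three or more physical lines would not come out correctly with this command.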
Output:
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5, line 5 text
10,1,6,"line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
I did not add the missing quotation marks in line 5.
See: man sed and The Stack Overflow Regular Expressions FAQ
In addition to Cyrus's answer, to ensure 'line 5 text' is surrounded by double quotes, you can add expressions that replace ', ' with ',"' and append a '"' to lines that do not end in '"', e.g.
sed -e '/".*"$/!{N;s/\n *//}' -e 's/, /,"/' -e '/"$/!{s/$/"/}' file
The first expression is exactly the same. This would provide your requested output of:
$ sed -e '/".*"$/!{N;s/\n *//}' -e 's/, /,"/' -e '/"$/!{s/$/"/}' file
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,"line 5 text"
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
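The three expressions can also be written one per line with a comment for each, which may be easier to follow. This is only a restatement of the command above, assuming GNU sed; the function name fix_csv is hypothetical, and the sample assumes (as the ', ' replacement implies) that the unquoted record has a space after its last comma.

```shell
# fix_csv is a hypothetical wrapper; it reads files or standard input.
# 1st expression: join a line that lacks a complete "..." field with the next line.
# 2nd expression: turn the first ', ' into ',"' to open the missing quote.
# 3rd expression: append a closing '"' to any line that does not end in one.
fix_csv() {
  sed -e '/".*"$/!{N;s/\n *//}' \
      -e 's/, /,"/' \
      -e '/"$/!{s/$/"/}' "$@"
}

# Made-up sample with one unquoted and one quoted split record.
printf '%s\n' '10,1,4,"line 4 text"' '10,1,5, ' '   line 5 text' \
              '10,1,6,"<J>' 'line 6 text"' | fix_csv
# prints:
# 10,1,4,"line 4 text"
# 10,1,5,"line 5 text"
# 10,1,6,"<J>line 6 text"
```

One caveat: the second expression rewrites the first ', ' on every line, so a quoted field that itself contains a comma followed by a space (for example "a, b") would be altered as well.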
With GNU awk for multi-char RS, RT, and gensub(), you can just describe each record as a series of 4 comma-separated fields ending in a newline and then remove the newlines and the spaces around them:
$ awk -v RS='([^,]*,){3}[^,]*\n' '{$0=gensub(/\s*\n\s*/,"","g",RT)} 1' file
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,line 5 text
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
and to ensure quotes around the last field:
$ awk -v RS='([^,]*,){3}[^,]*\n' '{$0=gensub(/\s*\n\s*/,"","g",RT); $0=gensub(/,([^",]*)$/,",\"\\1\"",1)} 1' file
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,"line 5 text"
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
Note that this will work no matter how many lines your 4th field is split over:
$ cat file
10,1,1,"line 1 text"
10,1,2,
foo
line
2
text
bar
10,1,3,"line 3 text"
$ awk -v RS='([^,]*,){3}[^,]*\n' '{$0=gensub(/\s*\n\s*/,"","g",RT); $0=gensub(/,([^",]*)$/,",\"\\1\"",1)} 1' file
10,1,1,"line 1 text"
10,1,2,"fooline2textbar"
10,1,3,"line 3 text"
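If GNU awk is not available, a similar effect can be approximated in POSIX awk by treating any line that begins with three comma-terminated fields as the start of a new record and gluing every other line onto the previous one. This is a different technique from the RS/RT approach above and is offered only as a sketch: it assumes continuation lines never begin with three comma-separated fields themselves, it does not add missing quotes, and the function name join_records is made up.

```shell
# join_records is a hypothetical helper; it reads files or standard input.
join_records() {
  awk '
    # Trim leading and trailing blanks from each physical line.
    { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }

    # A line starting with three comma-terminated fields opens a new
    # record, so flush the record collected so far.
    /^[^,]*,[^,]*,[^,]*,/ {
      if (rec != "") print rec
      rec = $0
      next
    }

    # Any other line is a continuation of the current record.
    { rec = rec $0 }

    END { if (rec != "") print rec }
  ' "$@"
}

# Same multi-line sample as in the answer above.
printf '%s\n' '10,1,1,"line 1 text"' '10,1,2,' 'foo' 'line' '2' 'text' 'bar' \
              '10,1,3,"line 3 text"' | join_records
# prints:
# 10,1,1,"line 1 text"
# 10,1,2,fooline2textbar
# 10,1,3,"line 3 text"
```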