
sed - remove line break if line does not end with "

I have a TSV file in which some lines do not end with a '"'. I would like to remove every line break that does not directly follow a '"'. How can I do that with sed, or with any other shell tool?

Kind regards, Snafu

This sed command should do it:

sed '/"$/!{N;s/\n//}' file

It says: on every line not matching /"$/ (that is, not ending in a double quote), do:

  • read next line, append it to pattern space;
  • remove linebreak between the two lines.

Example:

$  cat file.txt
"test"
"qwe
rty"
foo
$  sed '/"$/!{N;s/\n//}' file.txt
"test"
"qwerty"
foo
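
One caveat: N runs only once per cycle, so this command removes at most one line break per record. A minimal sketch of that behaviour, using made-up sample data in which one quoted field is broken across three lines (GNU sed assumed):

```shell
# Hypothetical sample: a single quoted field split over three lines.
input=$(printf '%s\n' '"a long' 'text with' 'lines"')

# Cycle 1 joins the first two lines and prints them.
# Cycle 2 starts on the third line, which already ends in a quote,
# so it is printed on its own and the field stays split in two.
joined_once=$(printf '%s\n' "$input" | sed '/"$/!{N;s/\n//}')
printf '%s\n' "$joined_once"
```

If fields can span more than two lines, the join has to be repeated in a loop.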

To elaborate on @Lev's answer, the BSD (OSX) version of sed is less forgiving about the command syntax within the curly braces -- a semicolon is required after every command, including the last one before the closing brace:

sed '/"$/!{N;s/\n//;}' file.txt

Per the documentation, an excerpt:

Following an address or address range, sed accepts curly braces '{...}' so several commands may be applied to that line or to the lines matched by the address range. On the command line, semicolons ';' separate each instruction and must precede the closing brace.

Give this awk one-liner a try:

awk '{printf "%s%s",$0,(/"$/?"\n":"")}' file

Test:

kent$  cat f
"foo"
"bar"
"a long
text with
many many
lines"
"lalala"

kent$  awk '{printf "%s%s",$0,(/"$/?"\n":"")}' f
"foo"
"bar"
"a longtext withmany manylines"
"lalala"
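
Note that the one-liner joins the fragments with nothing in between, which is why the output reads "longtext". If the line breaks stood in for spaces (an assumption about the data, not something the question states), a variant that prints a space instead of the empty string may be closer to what is wanted. A sketch with sample input modelled on the test above:

```shell
# Sample input modelled on the test file above.
input=$(printf '%s\n' '"foo"' '"a long' 'text with' 'lines"')

# Same one-liner, except that a line not ending in a double quote
# is followed by a space rather than by nothing.
spaced=$(printf '%s\n' "$input" | awk '{printf "%s%s", $0, (/"$/ ? "\n" : " ")}')
printf '%s\n' "$spaced"
```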

This might work for you (GNU sed):

sed ':a;/"$/!{N;s/\n//;ta}' file

This checks whether the last character of the pattern space is a ". If it is not, it appends the next line, removes the newline, and repeats until the condition is met or end-of-file is reached.
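
To see the loop at work, here is a small check against a field that spans four lines. The sample data is the same as in the awk answer, and GNU sed is assumed, as stated above:

```shell
# Sample input: the third record is spread over four lines.
input=$(printf '%s\n' '"foo"' '"a long' 'text with' 'many many' 'lines"' '"lalala"')

# :a marks the loop start; ta jumps back to it after each successful join,
# so lines keep being appended until the pattern space ends in a double quote.
looped=$(printf '%s\n' "$input" | sed ':a;/"$/!{N;s/\n//;ta}')
printf '%s\n' "$looped"
```

Unlike the single N version, this collapses the whole multi-line field into one line.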

An alternative is:

sed -r ':a;N;s/([^"])\n/\1/;ta;P;D' file

The mechanism is left for the reader to ponder.
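
For readers who would rather not ponder, one possible reading of the script is sketched below as a commented multi-line version. It assumes GNU sed; -E is used here as the equivalent of -r, and the sample input is made up for illustration:

```shell
# Hypothetical sample: the second record is split over three lines.
input=$(printf '%s\n' '"foo"' '"a long' 'text with' 'lines"' '"lalala"')

alt=$(printf '%s\n' "$input" | sed -E '
  :a
  # N: append the next input line to the pattern space
  N
  # join the two lines only if the character before the newline
  # is not a double quote
  s/([^"])\n/\1/
  # t: if a join happened, loop back and pull in another line
  ta
  # P: print the finished record (everything up to the first newline)
  P
  # D: delete that record and restart the cycle with what is left,
  # without reading a new input line
  D
')
printf '%s\n' "$alt"
```

The pattern space therefore always holds the current record plus one line of lookahead, and a record is printed only once the lookahead shows that it ended in a double quote.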
