...,"My quote goes on
to multiple lines
like this",...
How would I catch this in a regular expression? I want to do this in a substitution to end up with
....,"My quote goes on to multiple lines like this",...
I tried
"(?<!\")\r\n(?!\")"
This was an attempt to match a newline that is neither preceded nor followed by a quote, that is, the previous line does not end with a quote and the next line does not start with one.
The following substitution was done in R using that regular expression with no luck...
newDF = gsub( "(?<!\")\r\n(?!\")", " ", newDF, perl = TRUE)
You can match a quoted substring and then use gsubfn to replace linebreaks inside the quoted substrings only:
library(gsubfn)
s = "...,\"My quote goes on\r\nto multiple lines\r\nlike this\",..."
gsubfn("\"[^\"]+\"", function(x) gsub("(?:\r?\n)+", " ", x), s)
[1] "...,\"My quote goes on to multiple lines like this\",..."
The "[^"]+" pattern matches all quoted substrings, and then (?:\r?\n)+ matches one or more sequences of an optional CR (\r?) followed by an LF; each such run is replaced with a single space.
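If adding the gsubfn dependency is not an option, the same idea (find the quoted substrings first, then clean up only those) can probably be sketched in base R with gregexpr() and the replacement form of regmatches(). This is only a sketch, not part of the answer above; the variable names res, m and fixed are made up for illustration.

```r
# Same sample input as above: CRLF linebreaks inside one quoted field
s <- "...,\"My quote goes on\r\nto multiple lines\r\nlike this\",..."

res <- s
# Locate every quoted substring
m <- gregexpr("\"[^\"]+\"", res)
# Replace linebreak runs with a space inside each matched substring only
fixed <- lapply(regmatches(res, m),
                function(q) gsub("(?:\r?\n)+", " ", q, perl = TRUE))
# Write the cleaned substrings back into their original positions
regmatches(res, m) <- fixed
res
```

Linebreaks outside the quotes are left untouched, because only the matched substrings are passed to gsub().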
Alternatively, you can achieve a similar result with a PCRE regex like
gsub("(?:\r?\n)+(?!(?:[^\"]|\"[^\"]*\")*$)", " ", s, perl = TRUE)
[1] "...,\"My quote goes on to multiple lines like this\",..."
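The (?!(?:[^\"]|\"[^\"]*\")*$) negative lookahead fails the match when the linebreak is followed by an even number of quotes up to the end of the string, so only linebreaks inside a quoted substring are replaced. This assumes the quotes in the input are balanced.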
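To see the lookahead at work, here is a small sketch with one linebreak inside a quoted field and one outside it (a record separator). The input string s2 and the name out are invented for this illustration.

```r
# Hypothetical input: the first \n is inside quotes, the second separates two records
s2 <- "a,\"x\ny\",b\nc,\"z\",d"

# Only a linebreak followed by an odd number of quotes (i.e. inside a quoted field) matches
out <- gsub("(?:\r?\n)+(?!(?:[^\"]|\"[^\"]*\")*$)", " ", s2, perl = TRUE)
out
# [1] "a,\"x y\",b\nc,\"z\",d"
```

The linebreak between the two records is followed by two quotes (an even number), so the lookahead rejects it and it stays in place.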
> x <- "My quote goes on
+ to multiple lines
+ like this"
> gsub("\\n", " ", x)
[1] "My quote goes on to multiple lines like this"
Don't forget to double the backslashes: inside an R string literal, the regex escape \n is written as "\\n".
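One caveat, as a sketch: the example above contains bare LF linebreaks, while the string in the question uses CRLF. Assuming CRLF input, matching only "\\n" would leave the carriage returns behind, so an optional "\\r" in front is probably what you want. The names lf_only and crlf are made up here.

```r
# CRLF variant of the example string (assumption: the real data uses \r\n)
x <- "My quote goes on\r\nto multiple lines\r\nlike this"

# Matching only LF leaves a stray \r before each inserted space
lf_only <- gsub("\\n", " ", x, perl = TRUE)

# Matching an optional CR plus LF removes the whole linebreak
crlf <- gsub("\\r?\\n", " ", x, perl = TRUE)
crlf
# [1] "My quote goes on to multiple lines like this"
```

Note that this replaces every linebreak in the string, not only those inside quotes.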