Split by paragraph in R

I'm trying to split a document into paragraphs in R.

test.text <- c("First paragraph.  Second sentence of 1st paragraph.

           Second paragraph.")
# When we run the below, we see separation of \n\n between the 2nd and 3rd sentences
test.text

# This outputs the desired 2 blank lines in the console
writeLines("\n\n")

a <- strsplit(test.text, "\\n\\n")

It's not splitting properly.

The output of strsplit is a list. Also, there are spaces after the \n\n, so the second piece keeps its leading indentation. We need to consume that whitespace in the pattern and convert the result to a vector, either with [[ or with unlist.

a <- strsplit(test.text, "\n+\\s+")[[1]]
a
#[1] "First paragraph.  Second sentence of 1st paragraph." "Second paragraph."        
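
If the input may contain blank lines that hold stray spaces or tabs, or paragraphs without indentation, a slightly more tolerant pattern can help. The sketch below is an illustration only: split_paragraphs is a hypothetical helper name, not part of base R or the answer above, and it assumes a paragraph break means a newline, optional spaces or tabs, then another newline.

```r
# Hypothetical helper (not from the original answer): split text into paragraphs.
split_paragraphs <- function(x) {
  # Paragraph break: newline, optional spaces/tabs, newline.
  # The trailing \\s* also swallows any indentation before the next paragraph.
  parts <- strsplit(x, "\\n[ \\t]*\\n\\s*", perl = TRUE)[[1]]
  # Trim leftover whitespace and drop empty pieces.
  parts <- trimws(parts)
  parts[nzchar(parts)]
}

test.text <- "First paragraph.  Second sentence of 1st paragraph.\n\n           Second paragraph."
paragraphs <- split_paragraphs(test.text)
paragraphs
```

Unlike "\n+\\s+", this pattern does not split on a single line break followed by indentation, so hard-wrapped lines inside one paragraph stay together.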
