R: remove hidden line break characters from text strings within a data frame

I have discovered that some strings within my data frame contain hidden line break characters, though I can't tell exactly which (when loaded into gVim they simply show up as line breaks). The following code:

gsub("[\\r\\n]", "", x)

successfully removes the line breaks from within the strings. However, it also removes the line breaks separating the cells, making my data frame atomic instead of recursive. How can I target only the line breaks within the strings while keeping my data frame intact?

Here's a sample of the data:

[Image: sample data frame]

Copying the comments above to close the question:

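# stringsAsFactors = TRUE reproduces the factor output shown below
# (it was the default before R 4.0.0).
dataframe <- data.frame(ID = 1:2, Name = 'XX',
  string_column = c('Hi \r\nyou\r\n', 'Always \r\nshare\r\n some \r\nsample\r\n data!'),
  stringsAsFactors = TRUE)
dataframe$string_column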
#> [1] Hi \r\nyou\r\n                                
#> [2] Always \r\nshare\r\n some \r\nsample\r\n data!
#> Levels: Always \r\nshare\r\n some \r\nsample\r\n data! Hi \r\nyou\r\n

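# Strip carriage returns and line feeds from this one column only,
# so the rest of the data frame is left untouched.
dataframe$string_column <- sapply(dataframe$string_column,
                                    function(x) { gsub("[\r\n]", "", x) })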
dataframe$string_column
#> [1] "Hi you"                         "Always share some sample data!"
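
The reason the original attempt lost the data frame structure is that gsub() coerces its input with as.character(), and calling that on a whole data frame returns a plain character vector with one element per column. The "line breaks separating the cells" were not actually removed; the data frame was flattened. A minimal sketch of cleaning every text column at once is below. It assumes the text columns are stored as character (not factor), and the names df and is_text are only illustrative. Because gsub() is vectorized, it can be applied to each column directly rather than element by element.

```r
# Illustrative data, mirroring the example above; stored as character here.
df <- data.frame(
  ID = 1:2,
  Name = "XX",
  string_column = c("Hi \r\nyou\r\n",
                    "Always \r\nshare\r\n some \r\nsample\r\n data!"),
  stringsAsFactors = FALSE
)

# Flag the character columns, then clean each of them column by column.
# Assigning the list back into df[is_text] keeps df a data frame.
is_text <- vapply(df, is.character, logical(1))
df[is_text] <- lapply(df[is_text], function(col) gsub("[\r\n]", "", col))

str(df)
```

If some columns are factors, one option is to convert them with as.character() first, or to clean levels() instead; which is preferable depends on how the data is used afterwards.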
