
R: remove hidden line break characters from text strings within a data frame

I have discovered that some strings within my data frame contain hidden line break characters, though I can't tell exactly which ones (when loaded into gVim they simply show up as line breaks). The following code:

gsub("[\\r\\n]", "", x)

successfully removes the line breaks from within the strings. However, it also removes the line breaks separating the cells, making my data frame atomic instead of recursive. How can I target only the line breaks within the strings while keeping my data frame intact?
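A plausible explanation, assuming `x` here is the whole data frame: `gsub()` coerces its input with `as.character()`, so a data frame is flattened to one string per column before any replacement happens. The sketch below uses a made-up two-row data frame `x` (not the asker's data) to show the effect.

```r
# Hypothetical data frame standing in for the asker's data
x <- data.frame(ID = 1:2,
                txt = c("a\r\nb", "c\r\nd"),
                stringsAsFactors = FALSE)

# gsub() is handed the whole data frame, not a single column
res <- gsub("[\r\n]", "", x)

is.data.frame(res)  # FALSE: the data frame structure is gone
is.atomic(res)      # TRUE: res is a plain character vector
length(res)         # one element per column, not one per cell
```

So the collapse likely comes from the coercion rather than from removing separators between cells, which suggests applying `gsub()` to individual columns.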

Here's a sample of the data:

[Image: sample data frame]

Copying the comments above to close the question:

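# Sample data with embedded CR/LF characters in string_column
dataframe <- data.frame(ID = 1:2, Name = 'XX',
  string_column = c('Hi \r\nyou\r\n', 'Always \r\nshare\r\n some \r\nsample\r\n data!'))
dataframe$string_column
#> [1] Hi \r\nyou\r\n                                
#> [2] Always \r\nshare\r\n some \r\nsample\r\n data!
#> Levels: Always \r\nshare\r\n some \r\nsample\r\n data! Hi \r\nyou\r\n
# (Factor output: data.frame() converted strings to factors by default before R 4.0.0)

# Strip every carriage return and line feed from this one column only,
# leaving the rest of the data frame untouched
dataframe$string_column <- sapply(dataframe$string_column,
                                    function(x) { gsub("[\r\n]", "", x) })
dataframe$string_column
#> [1] "Hi you"                         "Always share some sample data!"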
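To clean every text column at once without collapsing the data frame, one option is to loop over the columns and assign the result back with `df[] <-`, which keeps the data frame structure. The following is a minimal sketch, assuming the columns of interest are character or factor columns; the helper name `strip_line_breaks` and the `dirty` sample data are hypothetical and not part of the original answer.

```r
# Hypothetical helper: remove CR/LF from all character/factor columns,
# leaving other columns (and the data frame structure) untouched.
strip_line_breaks <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) col <- as.character(col)  # factors become plain strings
    if (is.character(col)) gsub("[\r\n]", "", col) else col
  })
  df
}

# Illustrative data mirroring the example above
dirty <- data.frame(ID = 1:2, Name = "XX",
                    string_column = c("Hi \r\nyou\r\n",
                                      "Always \r\nshare\r\n some \r\nsample\r\n data!"),
                    stringsAsFactors = FALSE)

clean <- strip_line_breaks(dirty)
is.data.frame(clean)   # the result is still a data frame
clean$string_column    # the cleaned strings
```

Since `gsub()` is vectorised over its input, it can be applied to a whole column directly; no `sapply()` is needed inside the helper.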


 