
Removing certain characters from a string in R

I have a string in R that contains a large number of words. When I view the string, I get a large amount of text, including passages like the following:

>docs

....

\u009cYes yes for ever for ever the boys cried in their ringing voices with softened faces

....

So I'm wondering how to remove these \u009 characters (all of them; some have slightly different numbers) from the string. I've tried using gsub(), but it did not remove them from the strings.

This should work:

gsub('\u009c','','\u009cYes yes for ever for ever the boys ')
"Yes yes for ever for ever the boys "

Here 009c is the character's Unicode code point in hexadecimal. R's \u escape accepts one to four hexadecimal digits; writing all four avoids ambiguity when the next character is itself a hex digit. If you have many such characters, one solution is to separate them with a pipe:

gsub('\u009c|\u00F0','','\u009cYes yes \u00F0for ever for ever the boys and the girls')

"Yes yes for ever for ever the boys and the girls"

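When many different code points are involved, listing each one in an alternation gets tedious. The following is a minimal sketch of an alternative that filters by code point range instead of by regex. It assumes the unwanted characters are all C1 control characters (U+0080 to U+009F), which is a guess based on the \u009c shown in the question; strip_c1 is a hypothetical helper name, not part of base R.

```r
# Hypothetical helper: drop every character whose code point lies in the
# assumed range U+0080..U+009F (C1 controls), keeping everything else.
strip_c1 <- function(x) {
  vapply(x, function(s) {
    cp <- utf8ToInt(enc2utf8(s))           # string -> integer code points
    intToUtf8(cp[cp < 0x80 | cp > 0x9f])   # keep code points outside the range
  }, character(1), USE.NAMES = FALSE)
}

strip_c1("\u009cYes yes for ever for ever the boys ")
```

Because the filter works on code points rather than bytes, accented letters such as \u00e9 are left alone. Adjust the range if the string contains other unwanted characters.
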
Try: gsub('\\$', '', '$5.00$')
