How to replace special characters using regular expressions?

How can I replace special characters using regular expressions? By "special", I mean the symbolic characters that sometimes appear in text.

For example, in the text below, I want to remove the bubble at the start of each line.

Passport Details

Name as on passport
Relationship
Passport Number
Date of Issue
Expiry Date
Place of Issue

Question edited: Sorry, the bubble at the start of each line is no longer visible. After I submitted the question, Stack Overflow removed that special character.

Does anyone know how to replace those special characters? I don't want to replace characters like #, @ or !. These are trivial and can be typed on a keyboard.

Sorry, I don't know how to put those special characters in my question, so I will try to explain. In a Word file, we put bullets before text. I want to replace the characters that represent such bullets. I have some text files that contain characters that look like a bubble.

Finally, I found the solution. This regular expression works for me:

([^(A-Za-z0-9)+|\\r|\\n|\\t|'|"|#|;|:|/|\\|.|,| ])
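Read as a regular (non-verbatim) string literal, so that `\\r` reaches the regex engine as `\r`, the expression above appears to be a single negated character class: it matches any character that is not an ASCII letter or digit, one of `\r`, `\n`, `\t` or space, or one of a few punctuation marks. The `|` separators inside the brackets are literal pipe characters, not alternation. A rough Python equivalent might look like the sketch below; the function name, the sample text and the bullet character U+2022 are assumptions made for illustration, since the original character was lost from the question.

```python
import re

# Assumed equivalent of the class above; inside [...] no '|' separators are needed.
# Kept characters: letters, digits, ( ) + | CR LF TAB ' " # ; : / . , and space.
NOT_ALLOWED = re.compile(r"""[^A-Za-z0-9()+|\r\n\t'"#;:/., ]""")

def strip_special(text: str) -> str:
    """Remove every character outside the allowed set."""
    return NOT_ALLOWED.sub("", text)

# Hypothetical sample: U+2022 (bullet) stands in for the lost "bubble" character.
sample = "\u2022 Name as on passport\n\u2022 Relationship\n"
cleaned = strip_special(sample)
print(cleaned)
```

Note that characters such as `@` and `!` are not in this class, so they would be removed too; they would need to be added to the brackets if they should survive.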

It would be possible to find all "special" characters with a regular expression like this and then just replace them with a space character:

/[<special_characters_here>]/

However, it is usually better to use whitelisting: list all allowed characters and replace everything else with a space character:

/[^<allowed_characters_here>]/
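As a minimal sketch of the difference between the two patterns, here is a Python version. The specific bullet glyphs in the blacklist and the allowed characters in the whitelist are assumptions chosen for illustration, not taken from the question.

```python
import re

# Blacklist: list the unwanted characters explicitly (here, a few bullet glyphs).
BLACKLIST = re.compile("[\u2022\u25cf\u25e6]")

# Whitelist: list what is allowed and match everything else.
WHITELIST = re.compile(r"[^A-Za-z0-9 .,#@!\n]")

def replace_blacklisted(text: str) -> str:
    """Replace only the characters named in the blacklist with a space."""
    return BLACKLIST.sub(" ", text)

def replace_non_whitelisted(text: str) -> str:
    """Replace every character that is not in the whitelist with a space."""
    return WHITELIST.sub(" ", text)

# U+25AA (small black square) is deliberately missing from the blacklist.
text = "\u2022 Passport Number\n\u25aa Date of Issue"
print(repr(replace_blacklisted(text)))
print(repr(replace_non_whitelisted(text)))
```

The blacklist misses the second glyph because nobody thought to list it, while the whitelist catches it without knowing about it in advance. That is the usual argument for whitelisting.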

(This was posted before the language had been specified.)

To replace non-ASCII characters with a space in Perl:

 $string =~ s/[^[:ascii:]]/ /g;

See http://codepad.org/KTMvQiOz. Here, [^[:ascii:]] is a regex that matches any non-ASCII character.
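The same idea can be sketched in Python. Python's built-in `re` module has no `[:ascii:]` class, so this version spells out the ASCII range explicitly; the function name and sample strings are hypothetical.

```python
import re

# Any character outside the 7-bit ASCII range U+0000..U+007F.
NON_ASCII = re.compile(r"[^\x00-\x7F]")

def replace_non_ascii(text: str, replacement: str = " ") -> str:
    """Replace each non-ASCII character with the given replacement."""
    return NON_ASCII.sub(replacement, text)

print(repr(replace_non_ascii("\u2022 Expiry Date")))
```

Be aware that this also replaces legitimate non-ASCII letters, such as accented characters in names, which may or may not be acceptable for the data at hand.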

Do you mean replacing the carriage return and newline characters?

If that's what you're after, this would do it:

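using System.Text.RegularExpressions;

// Replace each CRLF pair with a comma.
var source = "once\r\ntwice\r\nthrice";
var pattern = new Regex(@"\r\n");
var result = pattern.Replace(source, ",");
Assert.AreEqual("once,twice,thrice", result); // assumes a unit-test framework such as NUnit or MSTest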

I don't have enough time to flesh out a full example. But since you're using .NET, you can match on any number of these character classes:

http://msdn.microsoft.com/en-us/library/20bw873z.aspx

Choose what you want to accept and replace anything that does not belong to that set.
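.NET regular expressions support Unicode category escapes such as `\p{L}` (letters) and `\p{N}` (numbers). As a hedged illustration of "choose what to accept" in Python, whose built-in `re` module lacks `\p{...}`, the sketch below uses `unicodedata.category` instead. The accepted categories and the extra punctuation set are assumptions, not something the answer specifies.

```python
import unicodedata

# Assumed accepted set: letters (L*), numbers (N*), space separators (Zs),
# plus a few control and punctuation characters that can be typed on a keyboard.
ACCEPTED_MAJOR = {"L", "N"}
ACCEPTED_EXACT = {"Zs"}
ACCEPTED_EXTRA = set("\r\n\t#@!.,;:/'\"")

def is_accepted(ch: str) -> bool:
    """Return True if the character belongs to the accepted set."""
    cat = unicodedata.category(ch)
    return cat[0] in ACCEPTED_MAJOR or cat in ACCEPTED_EXACT or ch in ACCEPTED_EXTRA

def replace_unaccepted(text: str, replacement: str = " ") -> str:
    """Replace every character outside the accepted set."""
    return "".join(ch if is_accepted(ch) else replacement for ch in text)

print(repr(replace_unaccepted("\u2022 Jos\u00e9 #1")))
```

Unlike an ASCII-only whitelist, this keeps accented letters. The bullet U+2022 is in the "other punctuation" category, so it is replaced only because it is not listed among the extra characters.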
