I need some help replacing all non-word characters in a string.
For example, (stadtbezirkspräsident' should become stadtbezirkspräsident.
This regex should work for all languages, which makes it tricky, because I have no idea how to match characters like ñ or œ. I tried solving this with
string.replace(/[&\/\\#,+()$~%.'":*?<>-_{}]/g,' ');
but there are still too many special characters like Ø left.
Perhaps there is a general selector for this, or has anybody solved this problem before?
Try this trick:
str.replace(/(?!\w)[\x00-\xC0]/g, '')
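To see what this pattern does and where it stops, here is a minimal sketch. The function name stripLowRange is only an illustrative name and not part of the answer. The negative lookahead (?!\w) skips ASCII word characters, so only non-word characters in the range U+0000 to U+00C0 are removed; anything above U+00C0 is left untouched, whether or not it is a letter.

```javascript
// Illustrative wrapper (hypothetical name) around the pattern from the answer.
// (?!\w) excludes A-Z, a-z, 0-9 and _ from the match, so only the
// remaining characters in U+0000..U+00C0 are deleted.
function stripLowRange(str) {
  return str.replace(/(?!\w)[\x00-\xC0]/g, '');
}

// The example from the question: ( and ' are removed, ä (U+00E4) is above the range.
console.log(stripLowRange("(stadtbezirkspr\u00E4sident'")); // stadtbezirkspräsident

// À (U+00C0) is inside the range and is not matched by \w, so it is removed.
// ø (U+00F8) and ÷ (U+00F7) are above the range and are kept.
console.log(stripLowRange("\u00C0\u00F8\u00F7!")); // ø÷
```

Note that the range as written includes À (U+00C0) itself, so that letter is stripped, while symbols above the range such as ÷ (U+00F7) survive. Ending the range at \xBF would keep À, but that is a suggestion rather than something the answer states.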
If you have to define all the Unicode ranges yourself, it's going to be a lot of work.
It might make more sense to use Steven Levithan's XRegExp
package with its Unicode add-ons and utilize its Unicode property shortcuts:
var regex = new XRegExp("\\P{L}+", "g")
string = XRegExp.replace(string, regex, "")
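For comparison, the same \P{L} idea can be sketched without a library, assuming a JavaScript engine that supports ES2018 Unicode property escapes (the u flag). This is not what the answer above uses; the function name stripNonLetters is hypothetical.

```javascript
// Assumption: the engine supports ES2018 Unicode property escapes.
// \P{L} matches any code point that is NOT in the Unicode "Letter" category;
// the u flag is required for \p{...} / \P{...} to be recognized.
function stripNonLetters(input) {
  return input.replace(/\P{L}+/gu, "");
}

console.log(stripNonLetters("(stadtbezirkspr\u00E4sident'")); // stadtbezirkspräsident

// Combining marks are category M, not L, so a decomposed "a + U+0308"
// would lose its diaeresis. Normalizing to NFC first composes it into ä.
console.log(stripNonLetters("a\u0308".normalize("NFC"))); // ä
```

Like the XRegExp version, this removes spaces and digits as well, because they are not letters.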
This is more of a comment on Tim Pietzcker's answer, but presenting code in comments is awkward. Here's a simple example of using the XRegExp package:
<p id=orig>Bundespräsident / ß+ð/ə¿α!</p>
<p id=new></p>
<script src="http://cdnjs.cloudflare.com/ajax/libs/xregexp/2.0.0/xregexp-min.js">
</script>
<script src="http://xregexp.com/addons/unicode/unicode-base.js">
</script>
<script>
var regex = new XRegExp("\\P{L}+", "g");
var string = document.getElementById('orig').innerHTML;
string = XRegExp.replace(string, regex, "");
document.getElementById('new').innerHTML = string;
</script>
For production use, you would probably want to download a copy of the base package and the Unicode plug-in and serve them from your own server.
Note: The code checks for characters that are not classified as letters (alphabetic) in Unicode. I suppose this corresponds to what you mean by “word character”, though words in a natural language may contain hyphens, apostrophes, and other non-letters.
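If hyphens and apostrophes inside words should survive, one possible variation is sketched below. It is an assumption-laden sketch rather than part of the XRegExp example: it relies on native ES2018 Unicode property escapes (the u flag), the function name stripKeepingWordPunctuation is hypothetical, and it only treats the ASCII hyphen and apostrophe as word-internal punctuation.

```javascript
// Hypothetical helper, assuming ES2018 Unicode property escapes are available.
// Step 1: delete everything that is not a letter (L), a combining mark (M),
//         an ASCII apostrophe or an ASCII hyphen.
// Step 2: trim apostrophes and hyphens left dangling at either end of the string.
function stripKeepingWordPunctuation(input) {
  return input
    .replace(/[^\p{L}\p{M}'\-]+/gu, "")
    .replace(/^['\-]+|['\-]+$/g, "");
}

console.log(stripKeepingWordPunctuation("(don't)"));      // don't
console.log(stripKeepingWordPunctuation("well-known!"));  // well-known
console.log(stripKeepingWordPunctuation("(stadtbezirkspr\u00E4sident'")); // stadtbezirkspräsident
```

The trimming step only looks at the ends of the whole string, so for multi-word input it would make sense to split into words first and apply the function to each one.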
Beware that characters are added to Unicode, and the category of a character might (rarely) change. The package has been maintained well, though; it corresponds to Unicode 6.1 (version 6.2 is out, but it has no new letters).