I normally generate URL slugs with the string.js library (https://github.com/jprichardson/string.js), specifically its slugify
method. However, it removes all Chinese characters. As a workaround I use the following function:
var slugify = function (str) {
    str = str.replace(/\s+/g, '-'); // replace whitespace runs with dashes
    str = encodeURIComponent(str);  // percent-encode (this encodes the Chinese characters)
    return str;
};
So for the input 中文 标题
I get %E4%B8%AD%E6%96%87-%E6%A0%87%E9%A2%98
and it looks like this in the browser's address bar (and it works):
http://example.com/中文-标题
However, I also want to remove special characters such as !@#$%^&*) and so on.
The problem is that the string.js library uses the following piece of code internally:
.replace(/[^\w\s-]/g
It removes the special characters, but it also removes Chinese characters, because they are not matched by \w.
So my question is: how can I modify the regexp above so that it keeps Chinese characters?
I tried
replace(/[^a-zA-Z0-9_\s-\u3400-\u9FBF]/g,'')
But it still removes Chinese characters...
If you want to match (or exclude) a literal dash - inside a character set (square brackets), put it at the end of the set (or at the start, or escape it as \-). Anywhere else it is read as a range operator.
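Your regexp matches characters that are not:
a-z
A-Z
0-9
_
\s-\u3400 (that's your problem: the dash after \s cannot form a range, so \s, - and \u3400 are read as three separate members)
-
\u9FBF
So the only Chinese characters it keeps are \u3400 and \u9FBF themselves, not the range between them.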
You want to do:
replace(/[^a-zA-Z0-9_\u3400-\u9FBF\s-]/g,'')
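To see the difference between the two character classes, here is a minimal sketch that runs both against the sample title from the question. The helper name slugifyKeepingChinese is made up for this example and is not part of string.js; the trim step is also an assumption about how you would want leading and trailing whitespace handled.

```javascript
// Class from the question: "\s-\u3400" cannot form a range, so \s, "-",
// \u3400 and \u9FBF are treated as individual members.
var broken = /[^a-zA-Z0-9_\s-\u3400-\u9FBF]/g;

// Corrected class: the range \u3400-\u9FBF is intact and the dash comes last.
var fixed = /[^a-zA-Z0-9_\u3400-\u9FBF\s-]/g;

// Hypothetical helper (not part of string.js).
function slugifyKeepingChinese(str) {
  return str
    .replace(fixed, '')     // drop everything outside the allowed set
    .trim()                 // assumption: ignore leading/trailing whitespace
    .replace(/\s+/g, '-');  // turn whitespace runs into single dashes
}

// With the broken class the Chinese characters are stripped; only the space survives.
console.log(JSON.stringify('中文 标题'.replace(broken, ''))); // " "

// With the corrected class they are kept and the punctuation is removed.
console.log(slugifyKeepingChinese('中文 标题!@#$%^&*)')); // 中文-标题
```

The result can still be passed through encodeURIComponent, as in the workaround from the question, if a percent-encoded slug is needed.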
Alternatively, use a positive match list, i.e. list exactly the characters you want to remove:
replace(/[\!@#\$%^&\*\)]/g,'')
That said, I would consider leaving the characters that have a meaning in URLs (#, % and &) out of that list:
replace(/[\!@\$\^\*\)]/g,'')
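As a rough sketch of how that shorter list could be combined with the workaround from the question, the function below strips the listed characters first and then applies the same dash and encodeURIComponent steps. The name slugifyWithRemoveList is hypothetical, and the trim step is an assumption.

```javascript
// Hypothetical helper: removes only the explicitly listed characters,
// then applies the same steps as the workaround in the question.
function slugifyWithRemoveList(str) {
  var cleaned = str
    .replace(/[!@$^*)]/g, '')  // same list as above; no escapes needed inside [...]
    .trim()                    // assumption: ignore leading/trailing whitespace
    .replace(/\s+/g, '-');     // turn whitespace runs into single dashes
  return encodeURIComponent(cleaned); // percent-encode the Chinese characters
}

console.log(slugifyWithRemoveList('中文 标题!@$^*)'));
// %E4%B8%AD%E6%96%87-%E6%A0%87%E9%A2%98
```

Note that anything not on the list (including #, % and &) passes through to encodeURIComponent, which percent-encodes it rather than removing it.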
You can try uslug, which slugifies 汉语/漢語
to 汉语漢語.
If you want to convert Chinese characters to Pinyin instead, try transliteration.