JavaScript regex for splitting by whitespace with accented chars
I am trying to split a string in JavaScript by whitespace, while ignoring whitespace enclosed in quotes. So I googled this regular expression:
(/\w+|"[^"]+"/g)
The problem is that it doesn't work with accented chars like á. How should I improve my regular expression to make it work?
That's because \w only matches [A-Za-z0-9_]. To match accented characters, add the range \x81-\xFF, which covers most of the Latin-1 Supplement block and so includes characters such as à and ã. (It also includes non-letters such as the no-break space \xA0, so \xC0-\xFF is a tighter choice if only accented letters are wanted.) The resulting pattern:
(/[\w\x81-\xFF]+|"[^"]+"/g)
There's also this site, which is very helpful for building the required Unicode block range.
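As a quick check of the Latin-1 pattern above, here is a minimal sketch. The helper name `splitLatin1` and the sample string are illustrative assumptions, not part of the original answer; the input is written with `\u` escapes so the code points are unambiguous.

```javascript
// Pattern from the answer above: word characters plus \x81-\xFF, or a quoted phrase.
const latin1Pattern = /[\w\x81-\xFF]+|"[^"]+"/g;

// Hypothetical helper name, not from the original post.
function splitLatin1(input) {
  // String.prototype.match returns null when nothing matches, so fall back to [].
  return input.match(latin1Pattern) || [];
}

// The escapes spell out: café "déjà vu" señor
const latin1Sample = 'caf\u00E9 "d\u00E9j\u00E0 vu" se\u00F1or';
console.log(splitLatin1(latin1Sample));
```

Note that quoted tokens keep their surrounding quotes, and that letters outside Latin-1 (for example ő, U+0151) are still not matched by this range.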
This matches runs of non-whitespace characters that contain no quotes, and also matches text between quotes:
/[^\s"]+|"[^"]+"/g
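A short sketch of how this pattern might be used, assuming the surrounding quotes should be dropped from the result (the original answer does not say either way). The function name `splitQuoted` is hypothetical.

```javascript
// Pattern from the answer above: non-whitespace, non-quote runs, or a quoted phrase.
const quotedPattern = /[^\s"]+|"[^"]+"/g;

// Hypothetical helper name; stripping the quotes is an assumption about the desired output.
function splitQuoted(input) {
  const matches = input.match(quotedPattern) || [];
  // Remove one leading and one trailing double quote from each token, if present.
  return matches.map((token) => token.replace(/^"|"$/g, ''));
}

// The escapes spell out: Erdős "crème brûlée" x-y 42
console.log(splitQuoted('Erd\u0151s "cr\u00E8me br\u00FBl\u00E9e" x-y 42'));
```

Because the character class is negated rather than enumerated, this version does not depend on any particular Unicode range, so letters outside Latin-1 are handled as well.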
If you want to match all non-whitespace characters instead of only alphanumeric ones, replace \w with \S.
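One caveat that may be worth checking before relying on this: \S also matches the quote character, so if \S+ is the first alternative it can consume the opening quote before the quoted alternative is tried. The sketch below compares the two orderings; it assumes the substitution is applied to the pattern from the question, and the variable names are illustrative.

```javascript
// Direct substitution: \S+ is tried first and also consumes the quote character.
const substituted = /\S+|"[^"]+"/g;
// Same alternatives, but the quoted phrase gets the first chance to match.
const quotedFirst = /"[^"]+"|\S+/g;

// The escapes spell out: café "déjà vu"
const caveatSample = 'caf\u00E9 "d\u00E9j\u00E0 vu"';

const substitutedTokens = caveatSample.match(substituted) || [];
const quotedFirstTokens = caveatSample.match(quotedFirst) || [];

console.log(substitutedTokens); // the quoted phrase is split at its inner space
console.log(quotedFirstTokens); // the quoted phrase stays in one token
```

Putting the quoted alternative first keeps a phrase together when its opening quote starts a token; a quote appearing in the middle of a token is still swallowed by \S+.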