How to match non-ASCII (German, Spanish, etc.) letters in regex?

I was unable to find or create a regex that matches only letters, spaces, accented letters, and Spanish and German letters.

I'm using this for now:

var reg = new RegExp("^[a-z _]*$");

I've tried:

^[:alpha: _]*$   
^[a-zA-Z0-9äöüÄÖÜ]*$  
^[-\p{L}]*$   

Any ideas? Or is the regex syntax supported by JavaScript engines limited?

The second-to-last pattern looks like it should work, but its character class is missing " " and "_":

/^[a-zA-Z0-9äöüÄÖÜ]*$/.test("aäöüÄÖÜz") => true in FF 3.6 and IE8

/^[a-zA-Z0-9äöüÄÖÜ]*$/.test("é") => false in FF 3.6 and IE8
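
As a minimal sketch of that fix, the same class can be extended with a space and an underscore. The variable name is only illustrative, and the class still accepts nothing beyond the letters it lists:

```javascript
// The character class from the tests above, with " " and "_" added.
const lettersSpacesUnderscore = /^[a-zA-Z0-9 _äöüÄÖÜ]*$/;

console.log(lettersSpacesUnderscore.test("Jürgen Müller")); // true
console.log(lettersSpacesUnderscore.test("snake_case ä"));  // true
console.log(lettersSpacesUnderscore.test("é"));             // false: é is not listed in the class
```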

I am unable to find the other constructs ([:alpha:] and \p{L}) in the ECMAScript specification.
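
The tests above date from FF 3.6 and IE8. Engines that implement ES2018 or later accept Unicode property escapes such as \p{L}, but only when the `u` flag is set. The sketch below assumes such an engine, and its variable names are illustrative. It also shows why the two attempts fail without the flag: JavaScript has no POSIX classes, so both are read as lists of literal characters.

```javascript
// "[:alpha: _]" is just the characters ":", "a", "l", "p", "h", " " and "_".
const posixAttempt = /^[:alpha: _]*$/;

// Without the `u` flag, \p is an identity escape, so this class only
// contains the literal characters "-", "p", "{", "L" and "}".
const legacyAttempt = /^[-\p{L}]*$/;

// With the `u` flag (ES2018+), \p{L} matches any Unicode letter.
const unicodeLetters = /^[\p{L} _]*$/u;

console.log(posixAttempt.test("alpha"));           // true, but only by coincidence
console.log(posixAttempt.test("beta"));            // false
console.log(legacyAttempt.test("é"));              // false
console.log(unicodeLetters.test("Señor Müller"));  // true
console.log(unicodeLetters.test("Straße_número")); // true
console.log(unicodeLetters.test("abc123"));        // false: digits are not letters
```

Input that arrives with combining marks (for example "e" followed by U+0301) would need String.prototype.normalize("NFC") first, because a combining mark is not in \p{L}.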

Happy coding.

Edit: Also check the page encoding and make sure it is Unicode (most likely UTF-8). If this can't be ensured, use \uXXXX escape sequences in the regular expression (the escapes can be used in any case, and may help with source code editing and source control).
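
A sketch of the escaped form, using the code points for ä, ö, ü, Ä, Ö and Ü (the variable names are illustrative). When the pattern is built from a string with new RegExp, the backslash is doubled here so that the escape reaches the regex engine intact:

```javascript
// ä ö ü Ä Ö Ü written as \uXXXX escapes, so the source file stays pure ASCII.
const germanLetters = /^[a-zA-Z0-9 _\u00E4\u00F6\u00FC\u00C4\u00D6\u00DC]*$/;

// Same pattern built from a string: "\\u00E4" passes \u00E4 to the regex engine.
const sameFromString = new RegExp(
  "^[a-zA-Z0-9 _\\u00E4\\u00F6\\u00FC\\u00C4\\u00D6\\u00DC]*$"
);

console.log(germanLetters.test("\u00FCber")); // true  ("über")
console.log(sameFromString.test("\u00D6l"));  // true  ("Öl")
console.log(germanLetters.test("\u00E9"));    // false ("é" is not listed)
```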

I'm parsing a name input field, and this seems to be working for both German and French:

^[a-zA-Z\-ÀàÂâÄäÆæÇçÈèÉéÊêËëÎîÏïÔôÖöŒœÙùÛûÜüß]*$

Some folks have names like 'Rölf-Dieter', and this lets them through while still rejecting digits. A little extreme, but it works!
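
One way such a check could be wired up is sketched below. The function name isValidName is hypothetical, and the character list is an assumption: it spells out ä, ö and ß alongside the French letters so that German names like the one above pass. The hyphen is escaped with a single backslash so it is read as a literal rather than as a range.

```javascript
// Hypothetical helper; the name and the exact character list are assumptions.
function isValidName(name) {
  // ASCII letters, a literal hyphen, and a hand-picked set of French and
  // German accented letters. Digits and spaces are deliberately absent.
  const namePattern = /^[a-zA-Z\-ÀàÂâÄäÆæÇçÈèÉéÊêËëÎîÏïÔôÖöŒœÙùÛûÜüß]*$/;
  return namePattern.test(name);
}

console.log(isValidName("Rölf-Dieter")); // true
console.log(isValidName("Zoé"));         // true
console.log(isValidName("R2-D2"));       // false: contains digits
```

Because the pattern ends in * rather than +, an empty string also passes, so a required field would need a separate length check.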
