
Java regular expression for French names

I need to modify a regular expression so that it allows all standard characters, French characters, spaces and the dash (hyphen), but only one at a time, i.e. not two in a row.

What I have right now is:

import java.util.regex.Pattern;

public class FrenchRegEx {

    static final String NAME_PATTERN = "[\u00C0-\u017Fa-zA-Z-' ]+";

    public static void main(String[] args) {

        String name;

        //name = "Jean Luc"; // allowed
        //name = "Jean-Luc"; // allowed
        //name = "Jean-Luc-Marie"; // allowed
        name = "Jean--Luc"; // NOT allowed

        if (!Pattern.matches(NAME_PATTERN, name)) {
            System.out.println("ERROR!");
        } else System.out.println("OK!");
    }
}

But it accepts 'Jean--Luc' as a name, which should not be allowed.

Any help with this? Thanks.

So, you want runs of one or more name characters, separated by single hyphens (or spaces). It's just a matter of writing the pattern that way:

"[\u00C0-\u017Fa-zA-Z']+([- ][\u00C0-\u017Fa-zA-Z']+)*"

This also assumes that you don't want names to start or end with a hyphen or space, that you don't want more than one space in a row, and that a space may not follow or precede a hyphen.

You need to disallow consecutive hyphens. You can do that with a negative lookahead:
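As a rough check of the pattern above, the sketch below runs it against the names from the question plus a few edge cases. The class name SeparatedNameCheck, the helper isValidName and the extra sample strings are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.regex.Pattern;

final class SeparatedNameCheck {

    // One run of letters/apostrophes, followed by zero or more groups of
    // "a single hyphen or space, then another run of letters/apostrophes".
    private static final Pattern NAME = Pattern.compile(
            "[\u00C0-\u017Fa-zA-Z']+([- ][\u00C0-\u017Fa-zA-Z']+)*");

    // Hypothetical helper: true only if the whole string fits the pattern.
    static boolean isValidName(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Jean Luc", "Jean-Luc", "Jean-Luc-Marie", // accepted
            "Jean--Luc", "Jean- Luc", "-Jean", "Jean " // rejected
        };
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + (isValidName(s) ? "OK!" : "ERROR!"));
        }
    }
}
```

Compiling the pattern once into a static Pattern avoids recompiling it on every call, which Pattern.matches(String, CharSequence) would do.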

static final String NAME_PATTERN = "(?!.*--)[\u00C0-\u017Fa-zA-Z-' ]+";
                                    ^^^^^^^^

To stop any of the special characters (hyphen, apostrophe, space) from being repeated consecutively, use

static final String NAME_PATTERN = "(?!.*([-' ])\\1)[\u00C0-\u017Fa-zA-Z-' ]+";
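To see how the two lookahead variants differ, here is a small sketch that applies both to the same inputs. The class and method names are hypothetical; the two patterns are the ones given above.

```java
import java.util.regex.Pattern;

final class LookaheadNameCheck {

    // Rejects any string that contains "--" anywhere.
    private static final Pattern NO_DOUBLE_HYPHEN = Pattern.compile(
            "(?!.*--)[\u00C0-\u017Fa-zA-Z-' ]+");

    // Rejects any string in which a hyphen, apostrophe or space is
    // immediately followed by the same character (backreference \1).
    private static final Pattern NO_REPEATED_SPECIAL = Pattern.compile(
            "(?!.*([-' ])\\1)[\u00C0-\u017Fa-zA-Z-' ]+");

    static boolean passesNoDoubleHyphen(String name) {
        return NO_DOUBLE_HYPHEN.matcher(name).matches();
    }

    static boolean passesNoRepeatedSpecial(String name) {
        return NO_REPEATED_SPECIAL.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Jean-Luc", "Jean--Luc", "Jean  Luc", "Jean- Luc", "-Jean"};
        for (String s : samples) {
            System.out.println("[" + s + "] noDoubleHyphen=" + passesNoDoubleHyphen(s)
                    + " noRepeatedSpecial=" + passesNoRepeatedSpecial(s));
        }
    }
}
```

Two things to keep in mind with this approach: the backreference only rejects the same character doubled, so a mixed pair such as a hyphen followed by a space still passes, and neither variant prevents a name from starting or ending with a special character.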

Another way is to unroll the pattern a bit, so that the special characters can appear between letters but never consecutively (i.e. if you need to match strings like Abc-def'here):

static final String NAME_PATTERN = "[\u00C0-\u017Fa-zA-Z]+(?:[-' ][\u00C0-\u017Fa-zA-Z]+)*";

or, to allow only one special character, which may appear only between letters (i.e. if you need to allow only strings like abc-def or abc'def):

static final String NAME_PATTERN = "[\u00C0-\u017Fa-zA-Z]+(?:[-' ][\u00C0-\u017Fa-zA-Z]+)?";
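The difference between the two unrolled patterns is only the final quantifier (* versus ?). The following sketch, with hypothetical class and method names, puts them side by side:

```java
import java.util.regex.Pattern;

final class UnrolledNameCheck {

    private static final String LETTERS = "[\u00C0-\u017Fa-zA-Z]+";

    // Any number of "special char + letters" groups after the first run.
    private static final Pattern MANY_SEPARATORS = Pattern.compile(
            LETTERS + "(?:[-' ]" + LETTERS + ")*");

    // At most one "special char + letters" group after the first run.
    private static final Pattern ONE_SEPARATOR = Pattern.compile(
            LETTERS + "(?:[-' ]" + LETTERS + ")?");

    static boolean allowsManySeparators(String name) {
        return MANY_SEPARATORS.matcher(name).matches();
    }

    static boolean allowsOneSeparator(String name) {
        return ONE_SEPARATOR.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"abc", "abc-def", "abc'def", "Abc-def'here", "Jean--Luc"};
        for (String s : samples) {
            System.out.println("[" + s + "] many=" + allowsManySeparators(s)
                    + " one=" + allowsOneSeparator(s));
        }
    }
}
```

Unlike the lookahead versions, both unrolled patterns also reject a leading or trailing special character, because every special character must be followed by at least one letter.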

Note that you do not need anchors here, because the pattern is used with the .matches() method, which requires a full-string match.
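To illustrate why anchors matter once you leave .matches(), the sketch below compares a full match with Matcher.find(), which is satisfied by any matching substring. The class and method names are invented for this example; the pattern is the unrolled one from above.

```java
import java.util.regex.Pattern;

final class AnchorDemo {

    private static final String UNROLLED =
            "[\u00C0-\u017Fa-zA-Z]+(?:[-' ][\u00C0-\u017Fa-zA-Z]+)*";

    // Full-string match: behaves as if the pattern were anchored.
    static boolean fullMatch(String s) {
        return Pattern.matches(UNROLLED, s);
    }

    // find() without anchors: any matching substring is enough.
    static boolean unanchoredFind(String s) {
        return Pattern.compile(UNROLLED).matcher(s).find();
    }

    // find() with explicit anchors around the pattern.
    static boolean anchoredFind(String s) {
        return Pattern.compile("^(?:" + UNROLLED + ")$").matcher(s).find();
    }

    public static void main(String[] args) {
        String name = "Jean--Luc";
        System.out.println("matches():        " + fullMatch(name));      // false
        System.out.println("find(), no ^ $:   " + unanchoredFind(name)); // true, finds "Jean"
        System.out.println("find(), with ^ $: " + anchoredFind(name));   // false
    }
}
```

So if the same pattern is ever reused with find(), or in a tool that searches rather than matches, it needs explicit anchors to keep rejecting 'Jean--Luc'.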

NOTE: you may further tune the patterns by moving special chars that may appear anywhere in the string from the [-' ] character class to the [\u00C0-\u017Fa-zA-Z] character classes, like [\u00C0-\u017Fa-zA-Z'] , but watch out for - . It should be placed at the end, near ] .

Try using ([\u00C0-\u017Fa-zA-Z']+[- ]?)+ . This would match one or more names separated by exactly one dash or space. Note that the separator is optional after every name, so a single trailing dash or space is accepted as well.
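A quick way to probe this last pattern is to feed it the question's examples together with a trailing-separator case. This is only a sketch; the class name OptionalSeparatorCheck and the method isValidName are hypothetical.

```java
import java.util.regex.Pattern;

final class OptionalSeparatorCheck {

    // One or more groups of "letters/apostrophes, then an optional
    // single hyphen or space".
    private static final Pattern NAME = Pattern.compile(
            "([\u00C0-\u017Fa-zA-Z']+[- ]?)+");

    static boolean isValidName(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Jean Luc", "Jean-Luc-Marie", "Jean--Luc", "Jean-", "-Jean"};
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + (isValidName(s) ? "OK!" : "ERROR!"));
        }
    }
}
```

If a trailing dash or space must be rejected too, the form with the separator placed before each following name, as in the earlier answers, is the safer choice.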
