
Regex matching Java

I have a problem writing a regex that matches an upper-case letter possibly followed by a lower-case letter. I want to split after every such match, but I just can't seem to get it to work.

To make it more general: I want to split before and after every match of the regex.

Example string: "TeSTString"

Wanted result -> [Te, S, T, St, ring]

I have tried everything I can think of, but I keep getting tripped up by look-ahead and look-behind.

First I tried [A-Z][a-z]?, and that matches perfectly, but the matched text is removed...

result -> [ring]

After this I tried a positive look-ahead, (?=([A-Z][a-z]?)), which gave me something close...

result -> [Te, S, T, String]

and a look-behind, (<=?([A-Z][a-z]?)), which gave nothing at all...

result -> [TeSTString]

I even tried reversing the look-behind, (<=?([a-z]?[A-Z])), in a desperate attempt, but this was just as unsuccessful.

Can anyone point me in the right direction before I lose my mind?
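As a side note on the first attempt: `split` treats every match as a delimiter and throws it away, which is why the matched letters disappear. A minimal sketch of a different technique (collecting matches with `Matcher.find` instead of splitting) is shown below. The class name `TokenCollector` and the extra `[a-z]+` alternative for the leftover lower-case run are assumptions made for illustration, not part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenCollector {
    // An upper-case letter with an optional lower-case letter,
    // otherwise a run of lower-case letters (assumed handling of the remainder).
    private static final Pattern TOKEN = Pattern.compile("[A-Z][a-z]?|[a-z]+");

    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            out.add(m.group()); // keep the matched text instead of discarding it
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("TeSTString")); // prints [Te, S, T, St, ring]
    }
}
```

Characters that match neither alternative (digits, punctuation) are silently skipped by this sketch, so it only reproduces the wanted result for purely alphabetic input.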

Here's one convoluted pattern that produces the expected result.

String test = "TeSTStringOne";
System.out.println(
    Arrays.toString(
        //          | preceded by lowercase
        //          |        | followed by uppercase
        //          |        |       | or
        //          |        |       || preceded and followed by uppercase
        //          |        |       ||                  | or
        //          |        |       ||                  || preceded by uc
        //          |        |       ||                  || AND lowercase
        test.split("(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z])|(?<=[A-Z][a-z])")
    )
);

Output

[Te, S, T, St, ring, On, e]

Note

Replace [a-z] with \\p{Ll} and [A-Z] with \\p{Lu} to handle accented letters.
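To make the note concrete, here is a small sketch of the same three-alternative pattern with the Unicode categories substituted in. The class name `UnicodeCaseSplit` and the sample word are made up for illustration. The string literals use `\u` escapes so the source compiles regardless of file encoding.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class UnicodeCaseSplit {
    // Same three alternatives as the ASCII pattern, using \p{Lu} and \p{Ll}
    private static final Pattern BOUNDARY = Pattern.compile(
        "(?<=\\p{Ll})(?=\\p{Lu})"       // lower-case followed by upper-case
        + "|(?<=\\p{Lu})(?=\\p{Lu})"    // two upper-case letters in a row
        + "|(?<=\\p{Lu}\\p{Ll})");      // right after an upper + lower pair

    static String[] split(String input) {
        return BOUNDARY.split(input);
    }

    public static void main(String[] args) {
        // E-acute, t, E-acute, A-grave, C-cedilla, then "aise"
        String sample = "\u00C9t\u00C9\u00C0\u00C7aise";
        System.out.println(Arrays.toString(split(sample)));
    }
}
```

For the sample ÉtÉÀÇaise this yields the pieces Ét, É, À, Ça and ise, mirroring what the ASCII pattern does for TeSTString. How the accented characters are displayed depends on the console encoding.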

Try with:

(?<=[A-Z][a-z])|(?=(?<!^)[A-Z])

DEMO

  • (?<=[A-Z][a-z]) - positive lookbehind for an upper-case letter followed by a lower-case letter,
  • (?=(?<!^)[A-Z]) - positive lookahead for an upper-case letter, unless it is at the beginning of a line.
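A quick way to try this pattern from Java is sketched below. The class name `LookaroundSplit` and the helper `split` are illustrative only. The pattern contains no backslashes, so it can be pasted into a string literal unchanged.

```java
import java.util.Arrays;

class LookaroundSplit {
    // Split after an upper+lower pair, or before any upper-case letter
    // that is not at the very start of the input.
    static final String BOUNDARY = "(?<=[A-Z][a-z])|(?=(?<!^)[A-Z])";

    static String[] split(String input) {
        return input.split(BOUNDARY);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("TeSTString")));    // prints [Te, S, T, St, ring]
        System.out.println(Arrays.toString(split("TeSTStringOne"))); // prints [Te, S, T, St, ring, On, e]
    }
}
```

Both alternatives are zero-width, so nothing is consumed and every character of the input ends up in one of the pieces.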
