
How does this regex work in Java?

I have the following piece of code that splits a string and returns an array of strings.

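import java.util.Arrays;

public class SplitDemo {
    public static void main(String[] args) {
        String name = "what is going on";
        // An empty pattern matches at every position in the string.
        String[] ary = name.split("");
        System.out.println(Arrays.toString(ary));
    }
}
// Output on Java 7 and earlier (Java 8+ drops the leading empty string):
// [, w, h, a, t,  , i, s,  , g, o, i, n, g,  , o, n]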

To avoid the empty string at the start of the array, the following regex was used in the split, but I would like to know how it works.

import java.util.Arrays;

public class SplitDemo {
    public static void main(String[] args) {
        String name = "what is going on";
        String[] ary = name.split("(?!^)");
        System.out.println(Arrays.toString(ary));
    }
}
// Output: [w, h, a, t,  , i, s,  , g, o, i, n, g,  , o, n]

If someone can explain what the regex looks for and how it is used by split, it will be very helpful for the Java beginner community. Thanks a lot.

In your first example, the empty pattern matches before every character in the string: before the first character, before the second, and so on. The String.split(String) Javadoc says that trailing empty strings are discarded, but the result does include whatever comes before the first match, which here is the empty string in front of the first character. So the array is {"", "w", "h", ..., "n"}.
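The rule about empty strings is easier to see with a delimiter that has a visible width. The following is only a small sketch; the class name SplitEmptyStrings and the helper splitOnComma are made up for illustration, while String.split(String, int) is the standard overload that takes a limit.

```java
import java.util.Arrays;

public class SplitEmptyStrings {
    // Hypothetical helper: forwards to the standard String.split(String, int).
    static String[] splitOnComma(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        // With limit 0 (what split(String) uses), trailing empty strings are dropped.
        System.out.println(Arrays.toString(splitOnComma("a,b,,", 0)));   // [a, b]
        // The empty string before the first comma is kept.
        System.out.println(Arrays.toString(splitOnComma(",a,b", 0)));    // [, a, b]
        // A negative limit keeps the trailing empty strings too.
        System.out.println(Arrays.toString(splitOnComma("a,b,,", -1)));  // [a, b, , ]
    }
}
```

Only empty strings at the end are trimmed; an empty string produced in front of a comma at the start stays in the array.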

The second example uses a regexp that matches at every position except the beginning of the string. The (? and ) bound a lookahead, the ! makes it a negative lookahead, and the ^ means the beginning of the string. A lookahead consumes no characters, so the regexp matches after the first character, after the second, and so on, and none of the characters themselves are removed. So you have:

 w h a t   i s   g o i n g   o n
  ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^

Each caret marks a break point, shown as the gap in the line above it.
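One way to check where a zero-width pattern matches is to list the match positions with java.util.regex.Matcher. This is a sketch rather than how split is implemented; the class name LookaheadSplit and the helper matchPositions are hypothetical, and the short input "what" is used to keep the output readable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadSplit {
    // Hypothetical helper: collects the start index of every match of regex in input.
    static List<Integer> matchPositions(String regex, String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // The empty pattern matches at every position, including index 0.
        System.out.println(matchPositions("", "what"));       // [0, 1, 2, 3, 4]
        // The negative lookahead rejects index 0, where ^ would match.
        System.out.println(matchPositions("(?!^)", "what"));  // [1, 2, 3, 4]
        // Splitting at those positions leaves one character per element.
        System.out.println(Arrays.toString("what".split("(?!^)")));  // [w, h, a, t]
    }
}
```

The match at index 4 sits at the end of the input, so it only produces a trailing empty string, which split discards.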

split divides the string into substrings at each place the regex matches (a single character or a longer string), and the matched text itself is not included in the output. For example:

With String s1 = "divided by spaces"; the value of s1.split("\\s")[0] is "divided" and s1.split("\\s")[1] is "by", not " ".
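The same example as a runnable sketch; the class name WhitespaceSplit and the helper splitOn are made up for illustration. It also shows, as an aside, what happens when two whitespace characters are adjacent.

```java
import java.util.Arrays;

public class WhitespaceSplit {
    // Hypothetical helper: forwards to the standard String.split(String).
    static String[] splitOn(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String s1 = "divided by spaces";
        // The matched whitespace is removed and never appears in the result.
        System.out.println(Arrays.toString(splitOn(s1, "\\s")));  // [divided, by, spaces]

        // Two adjacent spaces are two separate matches of \s, leaving an empty string between them.
        System.out.println(Arrays.toString(splitOn("divided  by", "\\s")));   // [divided, , by]
        // \s+ treats the whole run of whitespace as one delimiter.
        System.out.println(Arrays.toString(splitOn("divided  by", "\\s+")));  // [divided, by]
    }
}
```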
