
Java regular expression - multiline

I have a large array of strings, and I need to use the strings in the array to build patterns. However, a string may span several lines in the text, and the constructed patterns don't match it even with the MULTILINE flag. Could anyone point out what is wrong? Thank you.

Here is my code:

String[] phrases = new String[2];
phrases[0] = "student (male)";
phrases[1] = "worker (female)";

Pattern[] ptn = new Pattern[phrases.length];

int i = 0;
for (String p : phrases)
{
    p = Pattern.quote(p);
    System.out.println(p);
    ptn[i] = Pattern.compile(p+"\\:\\s\\w+",Pattern.MULTILINE);
    i++;
}

String text = "student\n(male): John";
System.out.println(text);

for(Pattern p : ptn)
{
    Matcher m = p.matcher(text);
    while(m.find())
    {
        System.out.println(m.group());
    }
}

Here, you don't need that MULTILINE flag:

As @fge explained earlier, that flag only means that ^ (and $) will match the beginning (and end) of each line in the tested String.
Reminder: the default behavior (without that flag) causes ^ and $ to match the beginning and the end of the whole String, respectively.
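
To make the effect of the flag concrete, here is a minimal sketch using the text from the question. The class name MultilineFlagDemo and the helper countMatches are hypothetical, introduced only for illustration. It shows that MULTILINE changes where ^ can match, but does not let a literal space match a line break:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineFlagDemo {

    // Counts how many times the pattern is found in the input.
    static int countMatches(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "student\n(male): John";

        // Without MULTILINE, ^ matches only at the very start of the input.
        Pattern wholeInput = Pattern.compile("^\\S+");
        // With MULTILINE, ^ also matches right after each line terminator.
        Pattern perLine = Pattern.compile("^\\S+", Pattern.MULTILINE);

        System.out.println(countMatches(wholeInput, text)); // 1 ("student")
        System.out.println(countMatches(perLine, text));    // 2 ("student" and "(male):")

        // The flag does not change what a literal space matches:
        // the space in the pattern still cannot match the '\n' in the text.
        Pattern literalSpace = Pattern.compile("student \\(male\\)", Pattern.MULTILINE);
        System.out.println(countMatches(literalSpace, text)); // 0
    }
}
```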


If you want to match, at some point, either a space or a new line, I suggest matching \\s instead.

However, if you replace the following lines:

phrases[0] = "student (male)";
phrases[1] = "worker (female)";

with:

phrases[0] = "student\\s(male)";
phrases[1] = "worker\\s(female)";

Then you won't be able to use Pattern#quote to escape the parentheses, because it would quote the \\s as well and turn it into literal characters. I believe the simplest way is to escape the parentheses yourself, as follows:

phrases[0] = "student\\s\\(male\\)";
phrases[1] = "worker\\s\\(female\\)";
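
As a side note, quoting the phrase as a whole is what stops working; quoting each space-separated piece on its own is still possible. The following is only a sketch of that alternative, not part of the approach above. It assumes phrases are separated by single spaces, and the names QuotedWords and quoteWords are hypothetical:

```java
import java.util.regex.Pattern;

public class QuotedWords {

    // Quotes every space-separated piece and joins the pieces with \s,
    // so metacharacters stay literal while the gaps match any whitespace.
    static String quoteWords(String phrase) {
        StringBuilder regex = new StringBuilder();
        for (String word : phrase.split(" ")) {
            if (regex.length() > 0) {
                regex.append("\\s");
            }
            regex.append(Pattern.quote(word));
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile(quoteWords("student (male)") + ":\\s\\w+");
        System.out.println(p.matcher("student\n(male): John").find()); // true
    }
}
```

This variant may be useful if the phrases can contain metacharacters other than parentheses, since Pattern.quote handles all of them.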

If you can't modify these Strings, you can instead change:

p = Pattern.quote(p);

to:

p = p.replaceAll("(\\(|\\))", "\\\\"+"$1").replaceAll(" ", "\\\\s");

This will:

  • escape the ( and )
  • replace each space with \\s, so that it matches either a space or a new line.
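
Putting the pieces together, a minimal runnable sketch of the question's code with this replacement applied could look as follows. The class name PhraseMatcher and the helpers toRegex and findAll are hypothetical; the phrases and the text are taken from the question, and the MULTILINE flag is dropped because it is not needed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhraseMatcher {

    // Escapes parentheses and turns each space into \s.
    static String toRegex(String phrase) {
        return phrase
                .replaceAll("(\\(|\\))", "\\\\$1")
                .replaceAll(" ", "\\\\s");
    }

    // Returns every "<phrase>: <word>" occurrence found in the text.
    static List<String> findAll(String[] phrases, String text) {
        List<String> found = new ArrayList<>();
        for (String phrase : phrases) {
            Pattern p = Pattern.compile(toRegex(phrase) + ":\\s\\w+");
            Matcher m = p.matcher(text);
            while (m.find()) {
                found.add(m.group());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String[] phrases = { "student (male)", "worker (female)" };
        String text = "student\n(male): John";

        System.out.println(toRegex(phrases[0])); // student\s\(male\)
        for (String match : findAll(phrases, text)) {
            System.out.println(match); // the phrase is matched across the line break
        }
    }
}
```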

Here is an Ideone link to an executable example of how your code could look :)

Hope it helps!
