Java - How to write a regex that includes disjunction of variables of a given set

I need to retrieve numbers followed by specific units, such as 10 m, 5 km..., from given web pages. Those units are the keys of a Map<String, Integer>. keySet() returns a comma-separated list, like ["m", "km"...]. Is there a smart way to get the set as a disjunction of its elements, like ["m"|"km"|...], so that I could use it in a regex such as:

"(\\d+)"+ " " +"myMap.keySet()......"

Join the set with pipes: "(\\d+)\\s*(" + StringUtils.join(myMap.keySet(), "|") + ")"
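If pulling in a library for StringUtils (presumably Apache Commons Lang) is not an option, the same join can be done with the JDK alone. The following is only a sketch, assuming Java 8 or later; the class name UnitRegexJoin and the method buildRegex are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class UnitRegexJoin {
    // Joins the keys with "|" and wraps them in a capturing group.
    static String buildRegex(Set<String> units) {
        return "(\\d+)\\s*(" + String.join("|", units) + ")";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the generated pattern is predictable.
        Map<String, Integer> myMap = new LinkedHashMap<>();
        myMap.put("km", 1000);
        myMap.put("m", 1);

        Matcher matcher = Pattern.compile(buildRegex(myMap.keySet())).matcher("10 m, 5 km");
        while (matcher.find()) {
            // group(1) is the number, group(2) is the unit
            System.out.println(matcher.group(1) + " -> " + matcher.group(2));
        }
    }
}
```

Like the one-liner above, this has no word boundary after the unit, so it would also match the "1 m" in "1 mango".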

How about

myMap.keySet().toString().replaceAll(",\\s*", "|").replaceAll("^\\[|\\]$", "")
//                       ^                         ^
//                       |                         +remove [ at start and ] at end
//                       +replace `,` and spaces after it with |

instead of

myMap.keySet()

Your code can look like this:

String data = "1km is equal 1000 m, and 1  m is equal 100cm. 1 mango shouldnt be found";

Map<String, Integer> map = new HashMap<>();
map.put("m", 1);
map.put("km", 2);
map.put("cm", 3);

String regex = "\\d+\\s*("
        + map.keySet().toString()       //will create "[cm, m, km]"
            .replaceAll(",\\s*", "|")   //will change it to "[cm|m|km]"
            .replaceAll("^\\[|\\]$", "")//will change it to "cm|m|km"
        + ")\\b";                       
    // I added \\b - word boundary - to prevent matching `m` if it is at
    // start of some word like in 1 mango where it normally would match
    // (1 m)ango

Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(data);
while (m.find()) {
    System.out.println(m.group());
}
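Rewriting the output of toString() works for plain keys such as m or km, but it would misbehave if a key itself contained a comma, a bracket or a regex metacharacter. A possible variant, sketched here under the assumption that Java 8 streams are available, joins the keys directly and escapes each one with Pattern.quote. It also captures the number and the unit separately so the unit can be looked up in the map. The class name QuotedUnitRegex and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
class QuotedUnitRegex {
    // Builds (\d+)\s*(key1|key2|...)\b with every key quoted as a literal.
    static Pattern compile(Map<String, Integer> units) {
        String alternatives = units.keySet().stream()
                .map(Pattern::quote) // keys are matched literally, even if they contain | or ,
                .collect(Collectors.joining("|"));
        return Pattern.compile("(\\d+)\\s*(" + alternatives + ")\\b");
    }

    // Returns one "number unit -> mapValue" entry per match, in order of appearance.
    static List<String> findAll(String data, Map<String, Integer> units) {
        List<String> found = new ArrayList<>();
        Matcher m = compile(units).matcher(data);
        while (m.find()) {
            String unit = m.group(2);
            found.add(m.group(1) + " " + unit + " -> " + units.get(unit));
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("m", 1);
        map.put("km", 2);
        map.put("cm", 3);

        String data = "1km is equal 1000 m, and 1  m is equal 100cm. 1 mango shouldnt be found";
        findAll(data, map).forEach(System.out::println);
    }
}
```

The trailing \\b assumes every unit ends in a word character; a unit such as % would need a different boundary check.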

You can try this:

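// Assumes yourMap is not empty; with no keys, the substring call below
// would cut into "(?:" instead of removing a trailing "|".
String p = "\\d+ (?:";
for (String key : yourMap.keySet()) {
    p += key + "|";
}
// drop the last "|" and close the non-capturing group
p = p.substring(0, p.length() - 1) + ")";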
