How can I capture part of a string using regular expressions?
(In Java) I want to create a function that extracts parts of a string using regular expressions:
public HashMap<Integer,String> extract(String sentence, String expression){
}
// For example, I need to pass in a sentence like this:
HashMap<Integer,String> parts = extract("hello Jhon how are you", "(hello|hi) @1 how are @2");
// The expression validates the sentence: it must start with "hello" or "hi", followed by a word or group of words, then the words "how are", and then some other words.
// And I want to get this:
parts.get(1) --> "Jhon"
parts.get(2) --> "you"
// But the function should return null if I pass it this:
extract("any other words","hello @1 how are @2");
I was doing this without regular expressions, but the code became rather large. I'm not sure whether regular expressions would make the processing faster, or how I would do it with them.
Thanks to @ajb for the comment. I've modified my answer to meet Omar's requirement. It's more complicated than I thought, lol.
I assume Omar wants to use the expression he provides to capture specific words. He uses @1, @2 ... @n to mark what he wants to capture, and the integer is also the key used to retrieve the captured text from the map.
Edit: the OP wants to be able to put @n inside parentheses, so I preprocess the expression to replace "(" with "(?:". That way each group in the expression still takes effect for grouping and alternation, but it no longer captures.
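To illustrate why that preprocessing matters, here is a small sketch comparing a capturing and a non-capturing group. The class name NonCapturingDemo and the helper firstGroup are hypothetical names used only for this illustration; the patterns are the ones the expression "(hello|hi) @1" would turn into without and with the "(?:" rewrite.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class NonCapturingDemo {

    // Returns the text of capture group 1, or null if the regex is not found in the input.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // "(hello|hi)" captures, so group 1 is the greeting rather than the name.
        System.out.println(firstGroup("(hello|hi) (\\w*)", "hello Jhon"));   // prints hello

        // "(?:hello|hi)" only groups, so group 1 is now the placeholder's word.
        System.out.println(firstGroup("(?:hello|hi) (\\w*)", "hello Jhon")); // prints Jhon
    }
}
```

Without the rewrite, every pair of parentheses in the expression would shift the group numbers, and the captured placeholders could no longer be found by position.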
import java.util.ArrayList;
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {

    public static void main(String[] args) {
        Test test = new Test();

        String sentence1 = "whats the number of apple";
        String expression1 = "whats the (number of @1|@1s number)";
        HashMap<Integer, String> map1 = test.extract(sentence1, expression1);
        System.out.println(map1);

        String sentence2 = "whats the bananas number";
        HashMap<Integer, String> map2 = test.extract(sentence2, expression1);
        System.out.println(map2);

        String sentence3 = "hello Jhon how are you";
        String expression3 = "(hello|hi) @1 how are @2";
        HashMap<Integer, String> map3 = test.extract(sentence3, expression3);
        System.out.println(map3);
    }

    // Returns the captured words keyed by placeholder number.
    // The map is empty if the sentence does not match the expression.
    public HashMap<Integer, String> extract(String sentence, String expression) {
        // Turn the expression's own groups into non-capturing groups,
        // so that only the @n placeholders capture.
        expression = expression.replaceAll("\\(", "\\(?:");

        // Collect the distinct placeholder numbers in order of first appearance.
        ArrayList<Integer> keys = new ArrayList<Integer>();
        Matcher keyMatcher = Pattern.compile("@(\\d+)").matcher(expression);
        while (keyMatcher.find()) {
            Integer key = Integer.valueOf(keyMatcher.group(1));
            if (!keys.contains(key)) {
                keys.add(key);
            }
        }

        // Replace every placeholder with a capturing group that matches one word.
        String regex = expression.replaceAll("@\\d+", "(\\\\w*)");

        HashMap<Integer, String> map = new HashMap<Integer, String>();
        Matcher matcher = Pattern.compile(regex).matcher(sentence);
        while (matcher.find()) {
            // Groups in alternation branches that did not take part in the match are null; skip them.
            ArrayList<String> targets = new ArrayList<String>();
            for (int i = 1; i <= matcher.groupCount(); i++) {
                if (matcher.group(i) != null) {
                    targets.add(matcher.group(i));
                }
            }
            // Pair the captured words with the placeholder numbers, in order.
            for (int j = 0; j < keys.size() && j < targets.size(); j++) {
                map.put(keys.get(j), targets.get(j));
            }
        }
        return map;
    }
}
The output is as follows:
{1=apple}
{1=banana}
{1=Jhon, 2=you}
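The question asks for null when the sentence does not fit the expression, whereas the code above returns an empty map in that case. Below is a minimal sketch of one possible variation. It rests on two assumptions that are not part of the original answer: the whole sentence must match (Matcher.matches() instead of find()), and a placeholder may span several words (".+?" instead of "\w*"). The class name PlaceholderExtractor is hypothetical. It records one key per capturing group while building the regex, so a placeholder repeated across alternation branches still maps to the right number.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variation on the answer's extract(); a sketch, not a definitive implementation.
class PlaceholderExtractor {

    private static final Pattern PLACEHOLDER = Pattern.compile("@(\\d+)");

    // Returns the captured parts keyed by placeholder number,
    // or null if the sentence does not match the expression as a whole.
    static Map<Integer, String> extract(String sentence, String expression) {
        // Make the expression's own groups non-capturing, as in the answer.
        String template = expression.replace("(", "(?:");

        // Replace each @n with a capturing group and remember which key
        // belongs to which group index (keys.get(i - 1) is the key of group i).
        List<Integer> keys = new ArrayList<>();
        StringBuffer regex = new StringBuffer();
        Matcher placeholders = PLACEHOLDER.matcher(template);
        while (placeholders.find()) {
            keys.add(Integer.valueOf(placeholders.group(1)));
            placeholders.appendReplacement(regex, Matcher.quoteReplacement("(.+?)"));
        }
        placeholders.appendTail(regex);

        Matcher matcher = Pattern.compile(regex.toString()).matcher(sentence);
        if (!matcher.matches()) {
            return null; // Assumption: the entire sentence has to fit the expression.
        }

        Map<Integer, String> parts = new HashMap<>();
        for (int i = 1; i <= matcher.groupCount(); i++) {
            // Groups in alternation branches that were not taken are null.
            if (matcher.group(i) != null) {
                parts.put(keys.get(i - 1), matcher.group(i));
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(extract("hello Jhon how are you", "(hello|hi) @1 how are @2")); // {1=Jhon, 2=you}
        System.out.println(extract("any other words", "hello @1 how are @2"));             // null
    }
}
```

Like the answer's code, this sketch treats everything in the expression other than "(" and @n as regex syntax, so literal text containing regex metacharacters would need escaping first.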