How do I capture parts of a string using a regular expression?

Question

(In Java) I want to create a function that uses a regular expression to extract parts of a string:

public HashMap<Integer,String> extract(String sentence, String expression){
}

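// For example, I need to pass in a sentence like this:

HashMap<Integer,String> parts = extract("hello Jhon how are you", "(hello|hi) @1 how are @2");

// The expression matches when the sentence starts with hello or hi, followed by a word or a group of words, then the words "how are", then other words.
// I want to get this: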

parts.get(1) --> "Jhon"
parts.get(2) --> "you"

// But the function should return null if I call it like this:

extract("any other words","hello @1 how are @2");

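I had been doing this without regular expressions, but the code grew large. I am not sure whether regular expressions would give faster processing, or how to do this with them.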

Answer 1

Thanks to @ajb for the comment. I have revised my answer to meet Omar's requirements. It is more complicated than I thought, lol.

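I think Omar wants to use the expression he supplies to capture specific words. He writes @1, @2 ... @n to mark what he wants to capture, and the integer is also the key used to retrieve the captured text from the map.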
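
To make the placeholder idea concrete, here is a minimal sketch of how such a template could be translated into an ordinary regex. The class name PlaceholderDemo and the helpers placeholderKeys and toRegex are hypothetical names chosen for illustration, and the sketch assumes each placeholder stands for a single word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderDemo {
    // Hypothetical helper: lists the placeholder numbers in order of appearance.
    static List<Integer> placeholderKeys(String expression) {
        List<Integer> keys = new ArrayList<>();
        Matcher m = Pattern.compile("@(\\d+)").matcher(expression);
        while (m.find()) {
            keys.add(Integer.valueOf(m.group(1)));
        }
        return keys;
    }

    // Hypothetical helper: turns each @n into a capturing group for one word.
    // Assumption: a placeholder stands for a single word (\w+).
    static String toRegex(String expression) {
        return expression.replaceAll("@\\d+", "(\\\\w+)");
    }

    public static void main(String[] args) {
        String expression = "hello @1 how are @2";
        System.out.println(placeholderKeys(expression)); // [1, 2]
        System.out.println(toRegex(expression));         // hello (\w+) how are (\w+)
    }
}
```

The position of a key in the list corresponds to the number of the capturing group it produced, which is what lets a match be turned back into a map keyed by @n.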

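Edit: the OP wants to put @n inside parentheses, so I preprocess the expression and replace "(" with "(?:". That way the group still takes effect, but it is not captured.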
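
The effect of that replacement can be seen in a small sketch. The class NonCapturingDemo and the helper firstGroup are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonCapturingDemo {
    // Returns the text of group 1 after a successful find, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String sentence = "hi Jhon how are you";
        // With a capturing group, group 1 is the greeting, not the name.
        System.out.println(firstGroup("(hello|hi) (\\w+)", sentence));   // hi
        // With (?: ... ), the greeting is still matched but no longer numbered.
        System.out.println(firstGroup("(?:hello|hi) (\\w+)", sentence)); // Jhon
    }
}
```

Because the alternation no longer takes a group number, the only numbered groups left are the ones generated from the @n placeholders.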

import java.util.ArrayList;
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args){

        Test test = new Test();
        String sentence1 = "whats the number of apple";
        String expression1 = "whats the (number of @1|@1s number)";
        HashMap<Integer, String> map1 = test.extract(sentence1, expression1);
        System.out.println(map1);
        String sentence2 = "whats the bananas number";
        HashMap<Integer, String> map2 = test.extract(sentence2, expression1);
        System.out.println(map2);
        String sentence3 = "hello Jhon how are you";
        String expression3 = "(hello|hi) @1 how are @2";
        HashMap<Integer, String> map3 = test.extract(sentence3, expression3);
        System.out.println(map3);
    }

    public HashMap<Integer,String> extract(String sentence, String expression){
        // Turn the template's own groups into non-capturing ones so that
        // only the @n placeholders create capturing groups.
        expression = expression.replace("(", "(?:");
        // Record the placeholder number behind each capturing group, in order.
        // \\d+ (not \\d*) so that a bare "@" is never parsed as a number.
        ArrayList<Integer> keys = new ArrayList<Integer>();
        Matcher placeholders = Pattern.compile("@(\\d+)").matcher(expression);
        while(placeholders.find()){
            keys.add(Integer.valueOf(placeholders.group(1)));
        }
        // Each @n becomes a capturing group that matches one word.
        String regex = expression.replaceAll("@\\d+", "(\\\\w+)");
        HashMap<Integer, String> map = new HashMap<Integer, String>();
        Matcher matcher = Pattern.compile(regex).matcher(sentence);
        if(matcher.find()){
            for(int i = 1; i <= matcher.groupCount(); i++){
                // Groups inside an alternative that did not match are null.
                if(matcher.group(i) != null){
                    map.put(keys.get(i - 1), matcher.group(i));
                }
            }
        }
        // An empty map means the sentence did not match the expression.
        return map;
    }
}

The output is:

{1=apple}
{1=banana}
{1=Jhon, 2=you}
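
The question also asks for null when the sentence does not fit, and mentions that a placeholder may cover a group of words. A possible variant along those lines is sketched below. It is not the code from this answer: the class name ExtractVariant is hypothetical, and it assumes the whole sentence must match (matches() instead of find()) and that a reluctant ".+?" is an acceptable way to let a placeholder span several words.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractVariant {
    // Sketch only: returns null when the whole sentence does not fit the expression.
    static HashMap<Integer, String> extract(String sentence, String expression) {
        // Keep the template's own groups from being numbered.
        String template = expression.replace("(", "(?:");
        // Placeholder numbers, in the same order as the capturing groups they become.
        List<Integer> keys = new ArrayList<>();
        Matcher placeholders = Pattern.compile("@(\\d+)").matcher(template);
        while (placeholders.find()) {
            keys.add(Integer.valueOf(placeholders.group(1)));
        }
        // Assumption: a placeholder may span several words, so use a reluctant ".+?".
        String regex = template.replaceAll("@\\d+", "(.+?)");
        Matcher matcher = Pattern.compile(regex).matcher(sentence);
        if (!matcher.matches()) {
            return null;
        }
        HashMap<Integer, String> parts = new HashMap<>();
        for (int i = 1; i <= matcher.groupCount(); i++) {
            // Groups from an alternative that did not take part in the match are null.
            if (matcher.group(i) != null) {
                parts.put(keys.get(i - 1), matcher.group(i));
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(extract("hello Jhon Smith how are you", "(hello|hi) @1 how are @2"));
        System.out.println(extract("any other words", "hello @1 how are @2"));
    }
}
```

With matches(), the last placeholder is forced to extend to the end of the sentence, so trailing words end up in the final capture rather than being ignored.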

Solution 1
1 Accepted 2017-07-01 05:29:22
