
Regular Expression Pattern Metadata

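I would like to inspect a regular expression to identify the capturing groups it contains. Here is an example of the kind of API I have in mind: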

String pattern = "^My name is \"([^\"]*)\" and I am (\\d*) years old$";
Pattern p = Pattern.compile(pattern);

Group g1 = p.getGroups(0); // Representing the name group
g1.getStartPosition(); // should yield position in regex string, e.g. 14
g1.getEndPosition();   // 21

Group g2 = p.getGroups(1); // Representing the age group
g2.getStartPosition(); // 34
g2.getEndPosition();   // 39

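The standard java.util.regex.Pattern does not provide this, but I wonder whether there is an open-source library that would let me inspect a regular expression in this way?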
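
For reference, the standard API does report how many capturing groups a pattern has, through Matcher.groupCount(), but not where they sit in the pattern string. A minimal sketch of that (the class name GroupCountDemo and the helper countGroups are made up for illustration):

```java
import java.util.regex.Pattern;

class GroupCountDemo {
    // Hypothetical helper: counts capturing groups using only the standard API.
    // groupCount() depends on the pattern alone, so an empty input is enough.
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String regex = "^My name is \"([^\"]*)\" and I am (\\d*) years old$";
        System.out.println(countGroups(regex)); // prints 2
    }
}
```

This gives the number of groups only; their start and end offsets within the regex string are not exposed.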

I would rather not roll my own by picking the regex string apart with the java.lang.String API, since that would be particularly cumbersome.

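This is not a professional-grade API, but I suggest you try this class. I wrote it as an exercise, and it has several methods similar to those of Matcher, such as group(int group), start(int group), end(int group), and groupCount(). It is easy to use.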

import java.util.ArrayList;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Metamatcher {
    private String pattern;
    private TreeMap<Integer,Integer> groupsIndices;

    public Metamatcher(String pattern){
        this.pattern = pattern;
        groupsIndices = getGroups();
    }
    /**
     * @param group ordinal number of the group (0 refers to the whole pattern)
     * @return start index of the pattern fragment that captures the given group
     */
    public int start(int group){
        ArrayList<Integer> indices = new ArrayList<Integer>(groupsIndices.keySet());
        indices.add(0,0);
        return indices.get(group);
    }

    /**
     * @param group ordinal number of the group (0 refers to the whole pattern)
     * @return end index (exclusive) of the pattern fragment that captures the given group
     */
    public int end(int group){
        ArrayList<Integer> indices = new ArrayList<Integer>(groupsIndices.values());
        indices.add(0,pattern.length());
        return indices.get(group);
    }

    /**
     * @param group ordinal number of the group (0 refers to the whole pattern)
     * @return the fragment of the regular expression that captures the given group
     */
    public String group(int group){
        return pattern.substring(start(group), end(group));
    }

    /**
     * @return number of capturing groups within given regular expression
     */
    public int groupCount(){
        return groupsIndices.size();
    }

    @Override
    public String toString(){
        StringBuilder result = new StringBuilder();
        result.append("Groups count: ")
                .append(groupCount())
                .append("\n");
        for(int i = 0; i <= groupCount(); i++){
            result.append("group(")
                    .append(i).append(") ")
                    .append(start(i))
                    .append("-")
                    .append(end(i))
                    .append("\t")
                    .append(group(i))
                    .append("\n");
        }
        return result.toString();
    }

    /**
     * Extracts the fragments of the regular expression enclosed in parentheses, checks whether each one
     * is a capturing group, and stores its start and end indices in a Map.
     * @return Map from start index to end index of each capturing group in the pattern
     */
    private TreeMap<Integer,Integer> getGroups(){
        String copy = this.pattern;
        // Matches an innermost parenthesized fragment (one with no nested parentheses)
        Pattern innermostGroup = Pattern.compile("\\([^\\(\\)]+\\)");
        Matcher matcher = innermostGroup.matcher(copy);
        TreeMap<Integer,Integer> temp = new TreeMap<Integer,Integer>();

        while(matcher.find()){
            if(isCapturingGroup(matcher.group(0))){
                temp.put(matcher.start(), matcher.end());
            }
            // Blank out the processed fragment so that the enclosing group becomes innermost
            copy = copy.substring(0,matcher.start()) + replaceWithSpaces(matcher.group(0)) + copy.substring(matcher.end());
            matcher = innermostGroup.matcher(copy);
        }

        return temp;
    }

    /**
     * @param fragment fragment of the regular expression, enclosed in parentheses
     * @return true if the fragment is a capturing group
     */
    private boolean isCapturingGroup(String fragment){
        return fragment.matches("((?<!\\\\)\\((?!\\?<?[:=!])[^\\(\\)]+\\))");
    }

    /**
     * Provides a filler String composed of spaces, used to replace a fragment enclosed in parentheses.
     * @param part String starting and ending with parentheses
     * @return String composed of spaces (' ') with the same length as part
     */
    private String replaceWithSpaces(String part){
        StringBuilder filler = new StringBuilder(part.length());
        for(int i = 0; i < part.length(); i++){
            filler.append(' ');
        }
        return filler.toString();
    }
}

I tested it with a variety of inputs and compared the output against tools such as regex101; it works well for me.
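
One caveat: as far as I can tell, the regex-based extraction in the class above does not distinguish an escaped parenthesis such as \( or a parenthesis inside a character class such as [(] from a real group delimiter. As an alternative, here is a rough single-pass scanner sketch that tracks escapes and character classes. It is not part of the class above; the names GroupScanner, capturingGroups and isCapturing are made up for illustration, and \Q...\E quoting and comments mode are assumed not to occur in the pattern.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class GroupScanner {
    /**
     * Returns one {start, endExclusive} pair per capturing group,
     * ordered by opening parenthesis (the regex group numbering order).
     */
    static List<int[]> capturingGroups(String regex) {
        List<int[]> spans = new ArrayList<>();
        // Stack of open groups: index into spans, or -1 for a non-capturing group
        Deque<Integer> open = new ArrayDeque<>();
        int classDepth = 0; // > 0 while inside [...]
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {            // escaped character: skip the next one
                i++;
                continue;
            }
            if (classDepth > 0) {       // parentheses are literal inside a class
                if (c == '[') classDepth++;
                else if (c == ']') classDepth--;
                continue;
            }
            if (c == '[') {
                classDepth = 1;
            } else if (c == '(') {
                if (isCapturing(regex, i)) {
                    spans.add(new int[] {i, -1});
                    open.push(spans.size() - 1);
                } else {
                    open.push(-1);
                }
            } else if (c == ')' && !open.isEmpty()) {
                int idx = open.pop();
                if (idx >= 0) spans.get(idx)[1] = i + 1;
            }
        }
        return spans;
    }

    /** A group is capturing unless it starts with "(?", except for "(?<name>". */
    static boolean isCapturing(String regex, int open) {
        if (open + 1 >= regex.length() || regex.charAt(open + 1) != '?') return true;
        if (open + 3 < regex.length() && regex.charAt(open + 2) == '<') {
            char next = regex.charAt(open + 3);
            return next != '=' && next != '!'; // "(?<=" and "(?<!" are lookbehinds
        }
        return false;
    }

    public static void main(String[] args) {
        String regex = "^My name is \"([^\"]*)\" and I am (\\d*) years old$";
        for (int[] span : capturingGroups(regex)) {
            System.out.println(span[0] + "-" + span[1] + "\t" + regex.substring(span[0], span[1]));
        }
    }
}
```

Note that the offsets refer to the regex string as the regex engine sees it, not to the escaped Java source literal, so they differ from the numbers in the question wherever the literal contains backslash escapes.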
