
Extracting Operation(…); and sub Operation from String using REGEX

I have an issue with a regex in Java for Android. I would like to retrieve the first operation (and each of its sub-operations), as in the following samples:

  1. "OPERATION(ASYNC_OPERATION,_RFID_ITEM_SERIAL);"
  2. "OPERATION(CONCAT,~1261,01,OPERATION(ASYNC_OPERATION,_RFID_ITEM_ID);,21,OPERATION(ASYNC_OPERATION,_RFID_ITEM_SERIAL););"

As you can see, each operation can contain sub-operations, and that is where I am running into problems.

Currently I am using this regex: ^\s*(OPERATION\s*\(\s*)(.*)(\);)

but the index of the ");" it returns is always that of the last occurrence, which is wrong when a "Main" operation contains two sub-operations.

    private static Pattern operationPattern=Pattern.compile("^\\s*(OPERATION\\s*\\(\\s*)(.*)(\\);)",Pattern.CASE_INSENSITIVE);

    public Operation(String text){
        parseOperationText(text);
    }

    private void parseOperationText(String text){
        String strText = text.replace("#,", "§");
        Matcher matcher=operationPattern.matcher(strText);
        if(matcher.find()) {
            //This is an OPERATION
            subOperations=new ArrayList<>();
            String strChain = matcher.group(2);//Should only contain the text between "OPERATION(" and ");"

            int commaIdx = strChain.indexOf(",");
            if (commaIdx == -1) {
                //Operation without parameter
                operationType = strChain;
            } else {
                //Operation with parameters
                operationType = strChain.substring(0, commaIdx);
                strChain = strChain.substring(commaIdx + 1);
                while (strChain.length()>0) {
                    matcher = operationPattern.matcher(strChain);
                    if (matcher.find()) {
                        String subOpText=matcher.group(0);
                        strChain=StringUtils.stripStart(strChain.substring(matcher.end())," ");
                        if(strChain.startsWith(",")){
                            strChain=strChain.substring(1);
                        }
                        subOperations.add(new Operation(subOpText));
                    }
                    else{
                        commaIdx = strChain.indexOf(",");
                        if(commaIdx==-1)
                        {
                            subOperations.add(new Operation(strChain));
                            strChain="";
                        }
                        else{
                            subOperations.add(new Operation(strChain.substring(0,commaIdx)));
                            strChain=strChain.substring(commaIdx+1);
                        }
                    }
                }
            }
        }
        else {
            //Not an operation
            //...
        }
    }

It works for sample 1, but for sample 2, after finding the "Main" operation (CONCAT in the sample), the second match returns this:

OPERATION(ASYNC_OPERATION,_RFID_ITEM_ID);,21,OPERATION(ASYNC_OPERATION,_RFID_ITEM_SERIAL);
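This behaviour can be reproduced in isolation. The following is a minimal sketch, not part of the original code: the class name `GreedyDemo` and its helper `match` are hypothetical. It compares the greedy pattern from the question with a reluctant variant (`.*?`), to illustrate that neither quantifier counts parentheses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Pattern from the question: greedy ".*" runs to the LAST ");".
    static final Pattern GREEDY = Pattern.compile(
            "^\\s*(OPERATION\\s*\\(\\s*)(.*)(\\);)", Pattern.CASE_INSENSITIVE);

    // Reluctant variant (an assumption for comparison): stops at the FIRST ");".
    static final Pattern LAZY = Pattern.compile(
            "^\\s*(OPERATION\\s*\\(\\s*)(.*?)(\\);)", Pattern.CASE_INSENSITIVE);

    /** Returns the full match (group 0), or null when the pattern does not match. */
    static String match(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(0) : null;
    }

    public static void main(String[] args) {
        String sample2 = "OPERATION(CONCAT,~1261,01,OPERATION(ASYNC_OPERATION,_RFID_ITEM_ID);"
                + ",21,OPERATION(ASYNC_OPERATION,_RFID_ITEM_SERIAL););";
        String rest = "OPERATION(ASYNC_OPERATION,_RFID_ITEM_ID);"
                + ",21,OPERATION(ASYNC_OPERATION,_RFID_ITEM_SERIAL);";

        // Greedy: swallows both sub-operations in one match.
        System.out.println(match(GREEDY, rest));
        // Reluctant: cuts the main operation short at the first nested ");".
        System.out.println(match(LAZY, sample2));
    }
}
```

The greedy pattern returns the whole of `rest` as a single match, while the reluctant one truncates sample 2 right after `_RFID_ITEM_ID);`. Either way, a plain `.*` or `.*?` cannot tell which ");" closes which "OPERATION(".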

What I would like to retrieve is this:

  1. "CONCAT,~1261,01,OPERATION(ASYNC_OPERATION,_RFID_ITEM_ID);,21,OPERATION(ASYNC_OPERATION,_RFID_ITEM_SERIAL);"
  2. "ASYNC_OPERATION,_RFID_ITEM_ID"
  3. "ASYNC_OPERATION,_RFID_ITEM_SERIAL"

You could use this

"(?s)(?=OPERATION\\s*\\()(?:(?=.*?OPERATION\\s*\\((?!.*?\\1)(.*\\)(?!.*\\2).*))(?=.*?\\)(?!.*?\\2)(.*)).)+?.*?(?=\\1)(?:(?!OPERATION\\s*\\().)*(?=\\2$)"

to find the balanced OPERATION( ) string in group 0.

https://regex101.com/r/EsaDtC/1

Then use this

(?s)^OPERATION\((.*?)\)$

on that matched string to get the inner contents of the operation, which are in group 1.
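The second step can be sketched on its own as follows. This is only an illustration: the class name `InnerContentDemo` and the method `innerContent` are hypothetical, and the input is assumed to be an already balanced "OPERATION(...)" string that ends with its closing parenthesis (no trailing ";").

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InnerContentDemo {
    // Second regex from the answer, written as a Java string literal.
    static final Pattern INNER = Pattern.compile("(?s)^OPERATION\\((.*?)\\)$");

    /** Returns group 1 (the text between "OPERATION(" and the final ")"), or null. */
    static String innerContent(String balancedOperation) {
        Matcher m = INNER.matcher(balancedOperation);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(innerContent("OPERATION(ASYNC_OPERATION,_RFID_ITEM_ID)"));
        System.out.println(innerContent(
                "OPERATION(CONCAT,01,OPERATION(ASYNC_OPERATION,_RFID_ITEM_ID);)"));
    }
}
```

Because the pattern is anchored with `^` and `$`, the reluctant `.*?` is forced to extend up to the last ")" of the input, so nested operations stay inside group 1.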

In the end I am using two different regexes:

//First Regex catches main operation content (Group 2):  
\s*(OPERATION\s*\(\s*)(.*)(\);) 
//Second Regex catches next full sub "OPERATION(...);" (Group 0):
^(?:\s*(OPERATION\s*\(\s*))(.*)(?:\)\s*\;\s*)(?=\,)|^(?:\s*(OPERATION\s*\(\s*))(.*)(?:\)\s*\;\s*)$ 

I can then use the first regex to detect whether the text is an operation (matcher.find()) and capture its content in group 2. Then, for each comma-separated parameter, I check with the second regex whether it is a sub-operation. If it is, I recursively call the same function, which applies the first regex again, and so on.

    private static Pattern operationPattern=Pattern.compile("^\\s*(OPERATION\\s*\\(\\s*)(.*)(\\);)",Pattern.CASE_INSENSITIVE);
    private static Pattern subOperationPattern=Pattern.compile("^(?:\\s*(OPERATION\\s*\\(\\s*))(.*)(?:\\)\\s*\\;\\s*)(?=\\,)|^(?:\\s*(OPERATION\\s*\\(\\s*))(.*)(?:\\)\\s*\\;\\s*)$",Pattern.CASE_INSENSITIVE);

    private void parseOperationText(String strText ){
        Matcher matcher=operationPattern.matcher(strText);
        if(matcher.find()) {
            //This is an OPERATION
            subOperations=new ArrayList<>();
            String strChain = matcher.group(2);
            int commaIdx = strChain.indexOf(",");
            if (commaIdx == -1) {
                //Operation without parameter
                operationType = strChain;
            } else {
                //Operation with parameters
                operationType = strChain.substring(0, commaIdx);
                strChain = strChain.substring(commaIdx + 1);
                while (strChain.length()>0) {
                    matcher = subOperationPattern.matcher(strChain);
                    if (matcher.find()) {
                        String subOpText=matcher.group(0);
                        strChain=StringUtils.stripStart(strChain.substring(matcher.end())," ");
                        if(strChain.startsWith(",")){
                            strChain=strChain.substring(1);
                        }
                        subOperations.add(new Operation(subOpText));
                    }
                    else{
                        commaIdx = strChain.indexOf(",");
                        if(commaIdx==-1)
                        {
                            subOperations.add(new Operation(strChain));
                            strChain="";
                        }
                        else{
                            subOperations.add(new Operation(strChain.substring(0,commaIdx)));
                            strChain=strChain.substring(commaIdx+1);
                        }
                    }
                }
            }
        }
        else {
            //Fixed value: we store the value as is
            fieldValue = strText;
            operationType = OperationType.NONE;
        }
    }

    public Operation(String text){
        parseOperationText(text);
    }
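For comparison, here is a sketch of a different technique that avoids regex for the nesting part: counting parenthesis depth to find the ")" that closes each "OPERATION(". It is not the approach used above. The class `OperationScanner` and its methods are hypothetical, and it assumes the keyword is written exactly as "OPERATION(" (upper case, no whitespace before the parenthesis) and that no parameter value itself contains an unbalanced parenthesis.

```java
import java.util.ArrayList;
import java.util.List;

class OperationScanner {
    private static final String KEYWORD = "OPERATION(";

    /** Returns the inner text of every OPERATION(...), outermost first, in order of appearance. */
    static List<String> extractOperations(String text) {
        List<String> result = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = text.indexOf(KEYWORD, from);
            if (start < 0) {
                break; // no further operation
            }
            int open = start + KEYWORD.length() - 1; // index of the "("
            int close = findMatchingParen(text, open);
            if (close < 0) {
                break; // unbalanced input: stop rather than guess
            }
            result.add(text.substring(open + 1, close));
            from = open + 1; // keep scanning inside, so nested operations are found too
        }
        return result;
    }

    /** Returns the index of the ")" that balances the "(" at position open, or -1. */
    static int findMatchingParen(String text, int open) {
        int depth = 0;
        for (int i = open; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String sample2 = "OPERATION(CONCAT,~1261,01,OPERATION(ASYNC_OPERATION,_RFID_ITEM_ID);"
                + ",21,OPERATION(ASYNC_OPERATION,_RFID_ITEM_SERIAL););";
        for (String inner : extractOperations(sample2)) {
            System.out.println(inner);
        }
    }
}
```

On sample 2 this yields the three strings listed in the question: the full CONCAT content first, then the two ASYNC_OPERATION contents. Note that "ASYNC_OPERATION," is not mistaken for a keyword because it is followed by a comma rather than "(".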
