
Regex issue - Matching API name

I am currently working with a large code base in which the signature of one API recently changed, so I need to modify thousands of files to use the new signature. I wrote a Java program that takes all *.java files, looks for the old API pattern, and replaces it with the new pattern when found.

Old API

API(3,Utils.FIFTY,key1,key4)

New API

API(key1,key4)

I created this regex pattern to match the old API: API\\([\\d,\\s\\.\\w]*(key[\\.\\w\\s,]*)\\). When it matches, the call is replaced with

replaceString = matcher.group(1) + "(" + matcher.group(2) + ")";

With the current code, instead of the expected API(key1,key4), I am getting API(key4). I've analyzed the issue, and my inference is that the \\w in the first character class consumed the first key argument. To match correctly, I think a negative lookahead is needed.
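The behaviour described above can be reproduced in isolation. The following is a minimal sketch that uses the pattern from the question unchanged; the class name GreedyCaptureDemo and the helper capturedKeys are hypothetical names chosen only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyCaptureDemo {
    // The pattern from the question, unchanged.
    static final Pattern OLD_API =
        Pattern.compile("API\\([\\d,\\s\\.\\w]*(key[\\.\\w\\s,]*)\\)");

    // Hypothetical helper: returns what group 1 captures, or null if there is no match.
    static String capturedKeys(String call) {
        Matcher m = OLD_API.matcher(call);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The first character class is greedy and also accepts the letters of "key1",
        // so it gives back only as much as is needed for "key" to match once.
        System.out.println(capturedKeys("API(3,Utils.FIFTY,key1,key4)")); // prints key4
    }
}
```

The greedy class first consumes every argument, then backtracks one character at a time until the literal key can match, which happens at the last key argument.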

Can anyone share the best consistent way to resolve this regex issue?

FJ's answer doesn't handle this test case:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class APIUpdater {
   public static void main( String[] args ) {
      String source = "\n" +
        "API( key.getValue( 18 ),call( key1 ).mth(),key1,key4);\n" +
        "API(\n" +
        "\t3,\n" +
        "\tUtils.FIFTY,\n" +
        "\tkey1,\n" +
        "\tkey4 );\n" +
        "API(3,Utils.FIFTY,key1,key4);\n";
      Pattern p =
         Pattern.compile( "API\\([.\\w\\s,]*?,\\s*(key[\\.\\w\\s,]*)\\)" );
      Matcher m = p.matcher( source );
      // replaceAll() resets the matcher and scans the whole input itself,
      // so no find() loop is needed around it
      System.err.println( m.replaceAll( "API(key1,key4)" ));
   }
}

Output is:

API( key.getValue( 18 ),call( key1 ).mth(),key1,key4);
API(key1,key4);
API(key1,key4);

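The call spread over several lines and the extra spaces are handled here because this pattern uses \s where FJ's uses [ \t]; FJ's pattern as posted does not match a call on several lines. The first call, whose arguments contain nested parentheses, is not matched and is left unchanged.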

A real parser with a grammar is required to parse Java. Regular expressions cannot do this job because they work at the lexical level (the words, not the sentences).
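Short of a full parser, the nested-parentheses case can be handled by counting parentheses by hand instead of using a regex. The following is only a sketch under stated assumptions: the old call always has exactly four arguments and the new call keeps the last two (inferred from the single example in the question), and the source contains no parentheses or commas inside string literals or comments. The class ApiCallRewriter and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class ApiCallRewriter {
    // Splits an argument list on commas that are not nested inside parentheses.
    static List<String> splitTopLevel(String args) {
        List<String> parts = new ArrayList<>();
        int depth = 0, start = 0;
        for (int i = 0; i < args.length(); i++) {
            char c = args.charAt(i);
            if (c == '(') depth++;
            else if (c == ')') depth--;
            else if (c == ',' && depth == 0) {
                parts.add(args.substring(start, i).trim());
                start = i + 1;
            }
        }
        parts.add(args.substring(start).trim());
        return parts;
    }

    // Index of the ')' that closes the '(' at position open, or -1 if unbalanced.
    static int matchingParen(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            if (s.charAt(i) == '(') depth++;
            else if (s.charAt(i) == ')' && --depth == 0) return i;
        }
        return -1;
    }

    // Rewrites every four-argument API(...) call to API(arg3,arg4) (assumed rule).
    static String rewrite(String source) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        int at;
        while ((at = source.indexOf("API(", pos)) >= 0) {
            int open = at + 3;
            int close = matchingParen(source, open);
            if (close < 0) break; // unbalanced: copy the rest untouched
            out.append(source, pos, at);
            boolean partOfLongerName =
                at > 0 && Character.isJavaIdentifierPart(source.charAt(at - 1));
            List<String> args = splitTopLevel(source.substring(open + 1, close));
            if (!partOfLongerName && args.size() == 4) {
                out.append("API(").append(args.get(2)).append(',')
                   .append(args.get(3)).append(')');
            } else {
                out.append(source, at, close + 1); // leave other calls as they are
            }
            pos = close + 1;
        }
        return out.append(source.substring(pos)).toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite(
            "API( key.getValue( 18 ),call( key1 ).mth(),key1,key4);"));
    }
}
```

Because calls that do not have four arguments are left alone, an already converted API(key1,key4) is not changed by a second run. This is still not a Java parser: a call written as API (…) with a space before the parenthesis, or one containing string literals with parentheses, would need a real grammar-based tool.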

Something like the following should work:

API\([\.\w \t,]*?,\s*(key[\.\w \t,]*)\)

The main change here is the repetition on the first character class, from * to *?. It now matches as few characters as possible instead of as many as possible, so all of your key arguments are included in the capturing group.
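As a rough illustration of the lazy quantifier, the sketch below applies the pattern from this answer with Matcher.replaceAll and a $1 back-reference. The replacement string "API($1)" and the names LazyQuantifierDemo and rewrite are assumptions made for the example, not part of the original answer.

```java
import java.util.regex.Pattern;

class LazyQuantifierDemo {
    // The pattern from this answer, escaped for a Java string literal.
    static final Pattern OLD_API =
        Pattern.compile("API\\([\\.\\w \\t,]*?,\\s*(key[\\.\\w \\t,]*)\\)");

    // Hypothetical helper: rewrites each matched call, keeping only captured group 1.
    static String rewrite(String source) {
        return OLD_API.matcher(source).replaceAll("API($1)");
    }

    public static void main(String[] args) {
        // The lazy class stops at the first comma that is followed by "key".
        System.out.println(rewrite("x = API(3,Utils.FIFTY,key1,key4);"));
        // prints x = API(key1,key4);
    }
}
```

One caveat with this sketch: the pattern also matches an already converted call, so rewrite("API(key1,key4)") yields API(key4). It should therefore be run only once over each file.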

You may want to try Recoder, which lets you apply source code transformations.
