Distinguishing Literals from Variables/Symbols in Source Code

By literals, I mean all the constants, like:

int a = 10;
float b = 10.5f;
String all = "Hello";

Here 10 is an integer literal, 10.5f is a floating-point literal, and "Hello" is a string literal. After some experimenting, I got part of the way there with this code:

String s = "my source program that i am reading from file";
String lines[] = s.split("\n"); // Break my program into lines
for (int i = 0; i < lines.length; i++) {
    if (lines[i].contains("="))
        System.err.println(lines[i].substring(lines[i].indexOf("=") + 1, lines[i].indexOf(";")));
}

But it also prints the right-hand side of assignments that are not literals, such as:

Myapp a=new Myapp();

However, I need to find only the literals.

While there are better ways to approach this problem, a quick fix for your existing code is a small tweak: skip every right-hand side that contains "new".

    String s = "my source program that i am reading from file";
    String lines[] = s.split("\n"); // Break my program into lines
    for (int i = 0; i < lines.length; i++) {
        if (lines[i].contains("=")) {
            String literal = lines[i].substring((lines[i].indexOf("=") + 1), lines[i].indexOf(";"));
            if (!literal.contains("new"))
                System.err.println(literal);
        }
    }
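The tweak above still has a few rough edges: substring throws if a line has an "=" but no ";", and contains("new") also rejects a string such as "newline". A slightly more defensive variant of the same idea might look like the sketch below. The class name QuickLiteralFilter and the method extract are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class QuickLiteralFilter {
    // Match "new" only as a whole word, so text such as "newline" is not rejected.
    private static final Pattern NEW_KEYWORD = Pattern.compile("\\bnew\\b");

    // Hypothetical helper: returns the right-hand sides that do not contain the keyword "new".
    static List<String> extract(String source) {
        List<String> result = new ArrayList<>();
        for (String line : source.split("\n")) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // no assignment on this line
            }
            int semi = line.indexOf(';', eq + 1);
            if (semi < 0) {
                continue; // no terminating ';' after the '=', so nothing safe to cut out
            }
            String rhs = line.substring(eq + 1, semi).trim();
            if (!NEW_KEYWORD.matcher(rhs).find()) {
                result.add(rhs);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "int a = 10;\n"
                + "Myapp m = new Myapp();\n"
                + "String s = \"newline\";\n"
                + "int broken = 5";
        System.err.println(extract(source));
    }
}
```

This is still only a line-based heuristic: an assignment such as a = b; or a method call on the right-hand side would be printed as well, and a string literal that contains the separate word "new" would be dropped.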

If you really want to find all literals, hook up a Java parser or use the "javap" tool to look at the generated class files. Running it on code that includes these lines:

    int a = 20;
    long b = 10L;
    float c = 1.10E12f;

and filtering the disassembly with "grep" so that only the constant-loading instructions remain gives:

 javap -c Main.class | grep -E "const|push|//" | grep -vE "Field|Method|class"

   0: bipush        20
   2: ldc2_w        #2                  // long 10l
   6: ldc           #4                  // float 1.1E12f

This finds all literals, even those inside strings, implicit ones (as in i++), or ones that are quoted in some way. Notice that int literals loaded with bipush, sipush, or iconst_* can only be located through those instructions, as the javap disassembler generates no comment for them. More on bytecode and constants here.
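For the "hook up a Java parser" route, one option that needs no extra library is the Compiler Tree API that ships with the JDK. The sketch below parses a source string and collects every literal node. It assumes a full JDK is present (ToolProvider.getSystemJavaCompiler() returns null on a JRE-only install); the class names LiteralFinder and StringSource are invented for this example.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.LiteralTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class LiteralFinder {

    // Hypothetical in-memory source file, so nothing has to be written to disk.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Parses the source (no type resolution) and returns "<kind> <value>" for each literal.
    static List<String> findLiterals(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available; a JDK is required");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, source)));
        List<String> found = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitLiteral(LiteralTree node, Void unused) {
                        found.add(node.getKind() + " " + node.getValue());
                        return super.visitLiteral(node, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String source = "class Demo {\n"
                + "    int a = 10;\n"
                + "    float b = 10.5f;\n"
                + "    String all = \"Hello\";\n"
                + "    Myapp m = new Myapp();\n"
                + "}\n";
        findLiterals("Demo", source).forEach(System.err::println);
    }
}
```

Because the task only parses and never resolves types, the unknown class Myapp is not a problem, and the line with new Myapp() contributes no literal. Unlike the javap approach, this reports literals as they appear in the source, before the compiler folds or inlines any constants.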

If you are only interested in simple lines of the form <atomicType> <identifier> = <literal>; then search for them using a regular expression:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String pattern =
        "\\s*\\p{Alpha}[\\p{Alnum}_]*\\s+"  + // type followed by whitespace, e.g. "int "
        "\\p{Alpha}[\\p{Alnum}_]*\\s*=\\s*" + // Java identifier and '=', e.g. "myVar ="
        "([-+]?\\s*\\d*\\.?\\d+([eE][-+]?\\d+)?[LlFfDd]?|" + // numeric, non-hex
        "\"[^\"]*\")\\s*;"; // or a quoted string constant without escaped quotes
    Pattern p = Pattern.compile(pattern);
    Matcher m = p.matcher(input); // input holds the source text to scan
    while (m.find()) {
        String literal = m.group(1); // group 1 is the literal itself
        System.err.println(literal);
    }
