
How can I find a String within a Java program converted to a string?

Basically, I read a Java program into my program as a string, and I'm trying to extract the string literals from it. I have a loop that steps through each character of the program text, and this is what happens when it reaches a '"':

else if (ch == '"')
{
    String subString = " ";
    index ++;

    if (ch != '"')
    {
        subString += ch;
    }

    else
    {
        System.out.println(lineNumber + ", " + TokenType.STRING + ", " + subString);
        index ++;
        continue;
    }

Unfortunately, this isn't working. The println call is how I am trying to output subString.

Essentially, I am looking for a way to collect all the characters between two " characters into a single String.

You could use regular expressions:

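import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern regex = Pattern.compile("(?:(?<!')\"(.*?(?<!\\\\)(?:\\\\\\\\)*)\")");
Matcher m = regex.matcher(content); // content = the Java source text to scan
while (m.find()) {
    System.out.println(m.group(1)); // group 1 = text between the quotes
}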

This captures quoted strings and accounts for escaped quotes and backslashes.

To break down the pattern:

  1. (?: ... ) = don't capture as a group (the inside is captured instead)
  2. (?<!') = make sure there isn't a single quote before (to avoid '"')
  3. \"( ... )\" = capture what is inside the quotes
  4. .*? = match as few characters as possible
  5. (?<!\\\\) = don't match a single backslash before (the doubled escaping in the Java literal stands for one backslash in the content)
  6. (?:\\\\\\\\)* = match zero or more pairs of backslashes, i.e. an even number

Together, 5 and 6 accept the closing quote only when an even number of backslashes precedes it. In the scanned text this allows string endings like \\" and \\\\", but not \" or \\\", where the quote is escaped and therefore still part of the string.
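To see how the pattern behaves on escaped quotes, trailing backslashes and a '"' char literal, here is a small self-contained sketch. The class name StringLiteralRegexDemo, the extract method and the sample input are made up for illustration, and the lookbehind is written as (?<!').

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
public class StringLiteralRegexDemo {
    // The pattern from the answer, with the lookbehind written as (?<!')
    private static final Pattern STRING_LITERAL =
            Pattern.compile("(?:(?<!')\"(.*?(?<!\\\\)(?:\\\\\\\\)*)\")");

    // Returns the raw text between each pair of unescaped double quotes.
    public static List<String> extract(String content) {
        List<String> found = new ArrayList<>();
        Matcher m = STRING_LITERAL.matcher(content);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Scanned text:  String a = "say \"hi\""; String b = "dir\\"; char c = '"';
        String content =
                "String a = \"say \\\"hi\\\"\"; String b = \"dir\\\\\"; char c = '\"';";
        extract(content).forEach(System.out::println);
    }
}
```

On this input the sketch should report two strings, say \"hi\" and dir\\, and skip the quote inside the char literal. Escape sequences are returned as written in the source, not decoded.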

Non-regex solution, also taking care of escaped quotes:

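import java.util.ArrayList;
import java.util.List;

List<String> strings = new ArrayList<>();
int start = -1;       // index of the first char inside the current string, or -1 when outside one
int backslashes = 0;  // length of the run of backslashes immediately before position i
for (int i = 0; i < content.length(); i++) {
    char ch = content.charAt(i);
    if (ch == '\\') {
        backslashes++;
        continue;
    }
    if (ch == '"') {
        if (start == -1) {
            start = i + 1;                             // opening quote
        } else if (backslashes % 2 == 0) {
            strings.add(content.substring(start, i));  // unescaped closing quote
            start = -1;
        }
        // an odd run of backslashes means the quote is escaped and stays in the string
    }
    backslashes = 0; // any non-backslash character ends the run, so "a\nb" closes correctly
}
strings.forEach(System.out::println);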
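If the output should look like the question's "lineNumber, STRING, text" lines, the same idea can be written as an inner loop that keeps consuming characters until the closing quote. This is only a sketch: index, lineNumber and subString are borrowed from the question, the literal label STRING stands in for TokenType.STRING (which the question does not show), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner, sketched to mirror the loop in the question.
public class StringTokenScanner {
    // Returns one "line, STRING, text" entry per string literal found.
    public static List<String> scan(String content) {
        List<String> tokens = new ArrayList<>();
        int lineNumber = 1;
        int index = 0;
        while (index < content.length()) {
            char ch = content.charAt(index);
            if (ch == '\n') {
                lineNumber++;
                index++;
            } else if (ch == '"') {
                StringBuilder subString = new StringBuilder();
                index++; // step past the opening quote
                // Keep reading until the closing quote or the end of the input.
                while (index < content.length() && content.charAt(index) != '"') {
                    char inner = content.charAt(index);
                    if (inner == '\\' && index + 1 < content.length()) {
                        // A backslash escapes the next character: copy both verbatim.
                        subString.append(inner).append(content.charAt(index + 1));
                        index += 2;
                    } else {
                        subString.append(inner);
                        index++;
                    }
                }
                tokens.add(lineNumber + ", STRING, " + subString);
                index++; // step past the closing quote
            } else {
                index++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String content = "int x = 1;\nString s = \"a\\\"b\";\nString t = \"\";";
        scan(content).forEach(System.out::println);
    }
}
```

The key difference from the code in the question is that the characters are read inside a loop, so ch is refreshed for every position instead of being tested once. The sketch does not skip char literals such as '"' or quotes inside comments, and an unterminated string is reported up to the end of the input.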


 