What hidden characters can appear in Java source code?

Question

I am writing a lexical analyzer and have run into a problem. After reading all the characters from the source code, I put them in a string, then read it character by character and perform the appropriate operations. In the end, this generates a list containing language tokens, spaces, line breaks and ... a damn character I cannot identify and need to clean out.

for (int i = 0; i < tokenList.size(); i++) {
    // Remove spaces
    if (tokenList.get(i).getLexema().equals(" ")) {
        tokenList.remove(i);
    }
    // Remove empty strings
    else if (tokenList.get(i).getLexema().length() == 0) {
        print("ada");
        tokenList.remove(i);
    }
    // Remove tabs
    else if (tokenList.get(i).getLexema().equals("\t")) {
        tokenList.remove(i);
    }
    // Remove line breaks
    else if (tokenList.get(i).getLexema().equals("\n")) {
        print("ASD");
        tokenList.remove(i);
    }
}

Given the following input:

int a;
char n;

After all the analysis and cleanup, I get the following result:

00 - Lex: int
01 - Lex: a
02 - Lex: ;
03 - Lex: 
04 - Lex: char
05 - Lex: n
06 - Lex: ;

There is a blank entry (index 03) and I do not know how to remove it.

Answer 1

SOLUTION:

Well, those guys are incredible and I was able to solve my problem. The unidentified character was a carriage return ("\r"). The solution, using some better coding strategies:

for (int i = 0; i < tokenList.size(); i++) {
    String lexema = tokenList.get(i).getLexema();

    switch (lexema) {
        case "":    // empty string
        case " ":   // space
        case "\t":  // tab
        case "\n":  // line feed
        case "\r":  // carriage return: the "strange" character
            tokenList.remove(i);
            // Step back so the element shifted into slot i is not skipped.
            i = i - 1;
            break;
        default:
            break;
    }
}
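
When a lexeme prints as nothing, one way to find out what it really contains is to dump its characters as Unicode code units. The following is only a minimal sketch: the class name `HiddenChars` and the helper `describe` are hypothetical and not part of the asker's code.

```java
class HiddenChars {
    // Hypothetical helper: renders each char of a string as U+XXXX,
    // separated by single spaces, so invisible characters become visible.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A line ending as it appears in a file saved with Windows (CRLF) line breaks.
        System.out.println(describe("\r\n")); // prints "U+000D U+000A"
        System.out.println(describe(";\r"));  // prints "U+003B U+000D"
    }
}
```

U+000D is the carriage return ("\r") and U+000A is the line feed ("\n"). A file with CRLF line endings contains both at the end of every line, so filtering only "\n" leaves a "\r" token behind.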

Answer 2

An alternative and easier solution is to use Character.isWhitespace(). So your code could be as simple as:

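for (int i = 0; i < tokenList.size(); i++) {
    String lexema = tokenList.get(i).getLexema();

    // Check isEmpty() first: charAt(0) throws on an empty string.
    if (lexema.isEmpty() || Character.isWhitespace(lexema.charAt(0))) {
        tokenList.remove(i);
        i = i - 1; // re-examine the element that shifted into slot i
    }
}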
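
On Java 8 or later, the index bookkeeping can be avoided altogether with `List.removeIf`. This is a sketch under assumptions: the nested `Token` class is a hypothetical stand-in for the asker's token type (only `getLexema()` is known from the question), and `removeBlankTokens` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;

class TokenCleanup {
    // Hypothetical stand-in for the asker's token class.
    static class Token {
        private final String lexema;

        Token(String lexema) {
            this.lexema = lexema;
        }

        String getLexema() {
            return lexema;
        }
    }

    // Removes every token whose lexeme is empty or made up only of whitespace.
    // allMatch returns true for an empty stream, so "" is removed as well.
    static void removeBlankTokens(List<Token> tokens) {
        tokens.removeIf(t -> t.getLexema().chars().allMatch(Character::isWhitespace));
    }

    public static void main(String[] args) {
        List<Token> tokens = new ArrayList<>();
        for (String s : new String[] {"int", " ", "a", ";", "\r", "\n", "char", " ", "n", ";"}) {
            tokens.add(new Token(s));
        }
        removeBlankTokens(tokens);
        for (Token t : tokens) {
            System.out.println("Lex: " + t.getLexema());
        }
    }
}
```

Because `removeIf` handles the traversal internally, no element is skipped after a removal. Note that `Character.isWhitespace` does not treat the non-breaking space (U+00A0) as whitespace, so such a character would need its own check.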

Answer 1: accepted, posted 2018-10-10 02:18:35
Answer 2: posted 2018-10-10 02:26:16