
Java - An extra character in my code?

I'm using 3 classes: the Character class, the Scanner class, and the Test class.

This is the Character class:

public class Character {
    private char cargo = '\u0007'; 
    private String sourceText = ""; 
    private int sourceIndex = 0; 
    private int lineIndex = 0;
    private int columnIndex = 0;
    public Character(String sourceText, char cargo, int sourceIndex, int lineIndex, int columnIndex) {
        this.sourceText = sourceText;
        this.cargo = cargo;
        this.sourceIndex = sourceIndex;
        this.lineIndex = lineIndex;
        this.columnIndex = columnIndex;
    }
    /*****************************************************************************************/
    /* Returns the String representation of the Character object                      */
    /*****************************************************************************************/
    @Override
    public String toString() {
        switch (cargo) {
            case ' ': return String.format("%6d %-6d " + "    blank", lineIndex, columnIndex);
            case '\t': return String.format("%6d %-6d " + "    tab", lineIndex, columnIndex);
            case '\n': return String.format("%6d %-6d " + "    newline", lineIndex, columnIndex);
            default: return String.format("%6d %-6d " + cargo, lineIndex, columnIndex);
        }
    }
}

Here's my Scanner class:

public class Scanner {
    private String sourceText = ""; 
    private int sourceIndex = -1; 
    private int lineIndex = 0;
    private int columnIndex = -1;
    private int lastIndex = 0;
    /*****************************************************************************************/
    /* Assign proper values                                                                  */
    /*****************************************************************************************/ 
    public Scanner(String sourceText) {
        this.sourceText = sourceText;
        lastIndex = sourceText.length() - 1;
    }
    /*****************************************************************************************/
    /* Returns the next character in the source text                                         */
    /*****************************************************************************************/   
    public Character getNextCharacter() {
        if (sourceIndex > 0 && sourceText.charAt(sourceIndex - 1) == '\n') {
            ++lineIndex;
            columnIndex = -1;
        }
        ++sourceIndex;
        ++columnIndex;
        char currentChar = sourceText.charAt(sourceIndex);
        Character objCharacter = new Character(sourceText, currentChar, sourceIndex, lineIndex, columnIndex);
        return objCharacter;
    }
}

And this is the Test class's main method:

public static void main(String[] args) {
    String sourceText = "";
    String filePath = "D:\\Somepath\\SampleCode.dat";
    try { sourceText = readFile(filePath, StandardCharsets.UTF_8); }
    catch (IOException io) { System.out.println(io.toString()); }
    LexicalAnalyzer.Scanner sca = new LexicalAnalyzer.Scanner(sourceText);
    LexicalAnalyzer.Character cha;
    int i =0;
    while(i < sourceText.length()) {
        cha = sca.getNextCharacter();
        System.out.println(cha.toString());
        i++;
    }
}

Basically, what I'm trying to do is print each character (including spaces, tabs, and newlines) in my source file, along with other character details such as line number and column number. Also, please note my switch and case statements in the toString() method of the Character class.

Let's say, for example, my file contains the text:

This is line #1. 
This is line #2.

From my code, I'm expecting to get:

 0 0      T
 0 1      h
 0 2      i
 0 3      s
 0 4          blank
 0 5      i
 0 6      s
 0 7          blank
 0 8      l
 0 9      i
 0 10     n
 0 11     e
 0 12         blank
 0 13     #
 0 14     1
 0 15     .
 0 16         newline
 1 0      T
 1 1      h
 1 2      i
 1 3      s
 1 4          blank
 1 5      i
 1 6      s
 1 7          blank
 1 8      l
 1 9      i
 1 10     n
 1 11     e
 1 12         blank
 1 13     #
 1 14     2
 1 15     .

However, I'm getting:

 0 0      T
 0 1      h
 0 2      i
 0 3      s
 0 4          blank
 0 5      i
 0 6      s
 0 7          blank
 0 8      l
 0 9      i
 0 10     n
 0 11     e
 0 12         blank
 0 13     #
 0 14     1
 0 15     .
 0 16     
 0 17         newline
 0 18     T
 1 0      h
 1 1      i
 1 2      s
 1 3          blank
 1 4      i
 1 5      s
 1 6          blank
 1 7      l
 1 8      i
 1 9      n
 1 10     e
 1 11         blank
 1 12     #
 1 13     2
 1 14     .

Notice what it prints when there's a newline character. Space and tab characters work fine and give me what I want, but newline does not. BTW, this is just a Java port of this: http://parsingintro.sourceforge.net/#contents_item_4.2 .

Please don't attack me. I've been trying to find out the reason behind this for hours and hours.


Note

Using %n in String.format, or System.getProperty("line.separator"), might help too. See: How do I get a platform-dependent new line character?
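To make the note above concrete, here is a minimal sketch of what %n and the line separator give you. The class and method names (LineSeparatorDemo, twoLines) are made up for illustration. Keep in mind this only affects text you write; it does not change the characters already present in a file you read.

```java
final class LineSeparatorDemo {
    // Joins two strings with the platform line separator by using %n.
    // (Hypothetical helper, not part of the code in the question.)
    static String twoLines(String first, String second) {
        return String.format("%s%n%s", first, second);
    }

    public static void main(String[] args) {
        String sep = System.lineSeparator();
        // On Windows the separator is "\r\n" (2 chars); on Linux/macOS it is "\n" (1 char).
        System.out.println("separator length: " + sep.length());
        // %n expands to the same separator that System.lineSeparator() returns.
        System.out.println(twoLines("a", "b").equals("a" + sep + "b"));
    }
}
```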

You're running on a Windows system, where a line ends with \r\n (carriage return plus line feed) rather than a bare \n.

The code handles only \n, so the \r shows up as an extra, unlabeled character.

I was able to produce output that makes sense with this change. Add this case to the switch:

case '\r': return String.format("%6d %-6d " + "    winNewline", lineIndex, columnIndex);

Resulting output:

 0 0      T
 0 1      h
 0 2      i
 0 3      s
 0 4          blank
 0 5      i
 0 6      s
 0 7          blank
 0 8      l
 0 9      i
 0 10     n
 0 11     e
 0 12         blank
 0 13     #
 0 14     1
 0 15     .
 0 16         blank
 0 17         winNewline
 0 18         newline
 0 19     T
 1 0      h
 1 1      i
 1 2      s
 1 3          blank
 1 4      i
 1 5      s
 1 6          blank
 1 7      l
 1 8      i
 1 9      n
 1 10     e
 1 11         blank
 1 12     #
 1 13     2
 1 14     .

Process finished with exit code 0
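Note that in the output above, T is still reported on line 0. As far as I can tell, that comes from getNextCharacter() checking sourceText.charAt(sourceIndex - 1), which is the character before the one returned last, so the line counter is bumped one character late. Below is a rough, self-contained sketch of the position tracking, not a drop-in replacement for the Scanner class. PositionSketch and describe are names I made up, and it assumes you want "\r\n" reported as a single newline.

```java
import java.util.ArrayList;
import java.util.List;

final class PositionSketch {
    // Returns one "line col label" entry per character,
    // treating a Windows "\r\n" pair as a single newline.
    static List<String> describe(String source) {
        List<String> out = new ArrayList<>();
        int line = 0;
        int col = 0;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            // Skip the '\r' of a "\r\n" pair so it is not reported on its own.
            if (c == '\r' && i + 1 < source.length() && source.charAt(i + 1) == '\n') {
                continue;
            }
            String label;
            switch (c) {
                case ' ':  label = "blank"; break;
                case '\t': label = "tab"; break;
                case '\n': label = "newline"; break;
                default:   label = String.valueOf(c);
            }
            out.add(line + " " + col + " " + label);
            // Advance the position right after the newline itself is reported,
            // so the very next character starts the new line at column 0.
            if (c == '\n') {
                line++;
                col = 0;
            } else {
                col++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String row : describe("ab\r\ncd")) {
            System.out.println(row);
        }
    }
}
```

In the original Scanner, the equivalent change would probably be to test sourceIndex >= 0 && sourceText.charAt(sourceIndex) == '\n' before incrementing, but I have not run that against your file.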

It's hard to tell from your output alone, but to debug this you can change the default case in your Character class to print the numeric (ASCII) code of the character:

default: return String.format("%6d %-6d %d", lineIndex, columnIndex, (int) cargo);

That will show you the ASCII code of the extra character you're getting. Once you have the code, look it up here: http://www.asciitable.com/

My guess is that the extra character is '\r' (carriage return), which precedes '\n' in Windows line endings.
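If you'd rather check this outside your Character class, a small stand-alone helper along these lines should do. This is only a sketch: CharCodeDemo and describe are hypothetical names, and the sample string in main is just an assumed fragment of a file with Windows line endings.

```java
final class CharCodeDemo {
    // Formats a character as its numeric code plus a readable name
    // for a few common whitespace/control characters.
    static String describe(char c) {
        String name;
        switch (c) {
            case '\r': name = "CR"; break;
            case '\n': name = "LF"; break;
            case '\t': name = "TAB"; break;
            case ' ':  name = "SPACE"; break;
            default:   name = String.valueOf(c);
        }
        // The character is passed as an argument, not concatenated into the
        // format string, so a literal '%' in the input cannot break the format.
        return String.format("%d (%s)", (int) c, name);
    }

    public static void main(String[] args) {
        // Assumed sample: end of one line and start of the next, with "\r\n" between.
        String sample = "1.\r\nT";
        for (int i = 0; i < sample.length(); i++) {
            System.out.println(describe(sample.charAt(i)));
        }
    }
}
```

A code of 13 right before a 10 would confirm the carriage-return guess.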

Hope this helps!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 