How do newline characters affect System.in.read() in Java

Question

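I am trying to write a lexer class that tokenizes the characters of an input stream, reading them with System.in.read(). The documentation says it returns -1 when the end of the stream is reached, but I can't understand why this behaves differently for different inputs. For example, delete.txt has the input: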

1. I have
2. bulldoz//er

The Lexer then tokenizes it correctly as:

[I=257, have=257, false=259, er=257, bulldoz=257, true=258]

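But if I now insert some blank lines with Enter, the code goes into an infinite loop. The code checks the input for newlines and spaces, so how does this input get past those checks?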

1. I have
2. bulldoz//er
3.

The complete code is:

package lexer;

import java.io.*;
import java.util.*;
import lexer.Token;
import lexer.Num;
import lexer.Tag;
import lexer.Word;

class Lexer{
    public int line = 1;
    private  char null_init = ' ';

    private  char tab = '\t';
    private char newline = '\n';
    private char peek = null_init;
    private char comment1 = '/';
    private char comment2 = '*';
    private Hashtable<String, Word> words = new Hashtable<>();

    //no-args constructor
    public Lexer(){
        reserve(new Word(Tag.TRUE, "true"));
        reserve(new Word(Tag.FALSE, "false"));
    }

    void reserve(Word word_obj){
        words.put(word_obj.lexeme, word_obj);
    }

    char read_buf_char() throws IOException {
        char x = (char)System.in.read();
        return x;
    }

    /*tokenization done here*/
    public Token scan()throws IOException{


        for(; ; ){
            // while exiting the loop, sometime the comment
            // characters are read e.g. in bulldoz//er, 
            // which is lost if the buffer is read;
            // so read the buffer i
            peek = read_buf_char();
            if(peek == null_init||peek == tab){
                peek = read_buf_char();
                System.out.println("space is read");
            }else if(peek==newline){
                peek = read_buf_char();
                line +=1;
            }
            else{
                break;
            }
        }

        if(Character.isDigit(peek)){
            int v = 0;
            do{
                v = 10*v+Character.digit(peek, 10);
                peek = read_buf_char();
            }while(Character.isDigit(peek));
            return new Num(v);
        }

        if(Character.isLetter(peek)){
            StringBuffer b = new StringBuffer(32);
            do{
                b.append(peek);
                peek = read_buf_char();
            }while(Character.isLetterOrDigit(peek));

            String buffer_string = b.toString();
            Word reserved_word = (Word)words.get(buffer_string);//returns null if not found

            if(reserved_word != null){
                return reserved_word;
            }

            reserved_word = new Word(Tag.ID, buffer_string);
            // put the key-value pair in the words hashtable
            words.put(buffer_string, reserved_word);
            return reserved_word;
        }

        // if character read is not a digit or a letter,
        // then the character read is a new token

        Token t = new Token(peek);
        peek = ' ';
        return t;

    }

    private char get_peek(){
        return (char)this.peek;
    }

    private boolean reached_buf_end(){
        // reached end of buffer
        if(this.get_peek() == (char)-1){
            return true;
        }
        return false;
    }

    public void run_test()throws IOException{
        //loop checking variable
        //a token object is initialized with dummy value
        Token new_token = null;
        // while end of stream has not been reached
        while(this.get_peek() != (char)-1){
            new_token = this.scan();

        }

        System.out.println(words.entrySet());
    }


    public static void main(String[] args)throws IOException{
        Lexer tokenize = new Lexer();
        tokenize.run_test();
    }

}

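The get_peek function returns the value of peek, which holds the current character from the input buffer.
The run_test function checks whether the end of the buffer has been reached.
The main processing is done in the scan() function.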

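I use the following command: cat delete.txt|java lexer/Lexer to feed the file to the compiled Java class as input. Can you tell me why this code goes into an infinite loop when the input file has extra newlines?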

Answer 1

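I'm not sure your check for the end of the stream (-1) works the way you expect. At the end of scan() you assign a space to peek, so I think that when there are blank lines you never catch the -1: the end-of-stream value is read at the top of the loop, falls through to the final Token branch, and is overwritten with ' ' before run_test gets to look at it. That is where it gets messy.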
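
One way to avoid losing the -1 is to keep the result of read() as an int and test it before any cast to char, and to advance peek with another read instead of resetting it to a space. The following is only a minimal sketch of that idea, not a drop-in replacement for the Lexer above: the class name EofAwareScanner, the next() and tokenize() methods, and the choice to return plain strings instead of Token objects are all assumptions made for illustration. It also assumes single-byte (ASCII) input, since read() returns one byte at a time.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original Lexer.
final class EofAwareScanner {
    private static final int EOF = -1;
    private final InputStream in;
    // Kept as an int so that -1 can never be mistaken for a real character.
    private int peek = ' ';
    int line = 1;

    EofAwareScanner(InputStream in) {
        this.in = in;
    }

    /** Returns the next lexeme, or null once the end of the stream is reached. */
    String next() throws IOException {
        // Skip whitespace with exactly one read per iteration, counting newlines.
        while (peek == ' ' || peek == '\t' || peek == '\r' || peek == '\n') {
            if (peek == '\n') {
                line++;
            }
            peek = in.read();
        }
        if (peek == EOF) {
            return null; // peek stays at -1, so later calls also return null
        }
        if (Character.isLetterOrDigit(peek)) {
            StringBuilder sb = new StringBuilder();
            do {
                sb.append((char) peek);
                peek = in.read();
            } while (peek != EOF && Character.isLetterOrDigit(peek));
            return sb.toString();
        }
        // Any other character is a one-character token.
        String single = String.valueOf((char) peek);
        peek = in.read(); // advance instead of resetting peek to ' '
        return single;
    }

    /** Convenience wrapper for trying the scanner on an in-memory string. */
    static List<String> tokenize(String text) {
        InputStream src = new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII));
        EofAwareScanner scanner = new EofAwareScanner(src);
        List<String> out = new ArrayList<>();
        try {
            for (String t = scanner.next(); t != null; t = scanner.next()) {
                out.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        EofAwareScanner scanner = new EofAwareScanner(System.in);
        for (String t = scanner.next(); t != null; t = scanner.next()) {
            System.out.println(t);
        }
    }
}
```

With this shape, trailing blank lines are consumed by the whitespace loop and the following read() hits the end of the stream, so the caller's loop ends. Returning null for end of stream is just one possible convention; a dedicated EOF token would work equally well.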
