
How do newlines affect System.in.read() in Java

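I am trying to write a lexer class that tokenizes the characters of an input stream, reading them with System.in.read(). The documentation says that it returns -1 when the end of the stream is reached, but I can't understand how this behavior differs for different inputs. For example, delete.txt contains: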

1. I have
2. bulldoz//er

The Lexer then tokenizes it correctly as:

[I=257, have=257, false=259, er=257, bulldoz=257, true=258]  

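But if I now insert some blank lines using Enter, the code goes into an infinite loop. The code does check the input for newlines and spaces, so how do they get past that check?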

1. I have
2. bulldoz//er
3.    

The full code is:

package lexer;

import java.io.*;
import java.util.*;
import lexer.Token;
import lexer.Num;
import lexer.Tag;
import lexer.Word;

class Lexer{
    public int line = 1;
    private  char null_init = ' ';

    private  char tab = '\t';
    private char newline = '\n';
    private char peek = null_init;
    private char comment1 = '/';
    private char comment2 = '*';
    private Hashtable<String, Word> words = new Hashtable<>();

    //no-args constructor
    public Lexer(){
        reserve(new Word(Tag.TRUE, "true"));
        reserve(new Word(Tag.FALSE, "false"));
    }

    void reserve(Word word_obj){
        words.put(word_obj.lexeme, word_obj);
    }

    char read_buf_char() throws IOException {
        char x = (char)System.in.read();
        return x;
    }

    /*tokenization done here*/
    public Token scan()throws IOException{


        for(; ; ){
            // while exiting the loop, sometime the comment
            // characters are read e.g. in bulldoz//er, 
            // which is lost if the buffer is read;
            // so read the buffer i
            peek = read_buf_char();
            if(peek == null_init||peek == tab){
                peek = read_buf_char();
                System.out.println("space is read");
            }else if(peek==newline){
                peek = read_buf_char();
                line +=1;
            }
            else{
                break;
            }
        }

        if(Character.isDigit(peek)){
            int v = 0;
            do{
                v = 10*v+Character.digit(peek, 10);
                peek = read_buf_char();
            }while(Character.isDigit(peek));
            return new Num(v);
        }

        if(Character.isLetter(peek)){
            StringBuffer b = new StringBuffer(32);
            do{
                b.append(peek);
                peek = read_buf_char();
            }while(Character.isLetterOrDigit(peek));

            String buffer_string = b.toString();
            Word reserved_word = (Word)words.get(buffer_string);//returns null if not found

            if(reserved_word != null){
                return reserved_word;
            }

            reserved_word = new Word(Tag.ID, buffer_string);
            // put key value pair in words hashtble
            words.put(buffer_string, reserved_word);
            return reserved_word;
        }

        // if character read is not a digit or a letter,
        // then the character read is a new token

        Token t = new Token(peek);
        peek = ' ';
        return t;

    }

    private char get_peek(){
        return (char)this.peek;
    }

    private boolean reached_buf_end(){
        // reached end of buffer
        if(this.get_peek() == (char)-1){
            return true;
        }
        return false;
    }

    public void run_test()throws IOException{
        //loop checking variable
        //a token object is initialized with dummy value
        Token new_token = null;
        // while end of stream has not been reached
        while(this.get_peek() != (char)-1){
            new_token = this.scan();

        }

        System.out.println(words.entrySet());
    }


    public static void main(String[] args)throws IOException{
        Lexer tokenize = new Lexer();
        tokenize.run_test();
    }

}

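The get_peek function returns the value of peek, which holds the current character of the input buffer.
The check for whether the end of the buffer has been reached is done in the run_test function.
The main processing is done in the scan() function.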

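I feed the file to the compiled Java class with the command: cat delete.txt|java lexer/Lexer. Please tell me why this code goes into an infinite loop when the input file has the extra newlines added.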

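I am not sure how you are checking for the end of the stream (-1). At the end of scan() you assign a space to peek, and I think that is what goes wrong when there are blank lines: the -1 gets overwritten before run_test can see it, so you never catch it.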
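
To make that concrete, here is a minimal sketch of one way to keep the end-of-stream marker from being lost. It is not the asker's Lexer: the class name EofAwareScanner and the helper tokenizeAll are hypothetical, tokens are returned as plain strings instead of Token objects, and the stream is passed in as a parameter so the example can run without System.in. The two ideas it illustrates are keeping peek as an int, so -1 is never squeezed into a char, and advancing peek by reading the next character instead of resetting it to a space.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and the String-based tokens are assumptions,
// not part of the original Lexer.
final class EofAwareScanner {
    private static final int EOF = -1;

    private final InputStream in;
    private int peek = ' ';   // int, not char, so -1 stays distinguishable
    int line = 1;

    EofAwareScanner(InputStream in) {
        this.in = in;
    }

    /** Returns the next token, or null once the end of the stream is reached. */
    String next() throws IOException {
        // Skip whitespace, reading exactly one character per iteration.
        while (peek == ' ' || peek == '\t' || peek == '\r' || peek == '\n') {
            if (peek == '\n') {
                line++;
            }
            peek = in.read();
        }

        // EOF is left in peek, so every later call also returns null.
        if (peek == EOF) {
            return null;
        }

        if (Character.isLetterOrDigit(peek)) {
            StringBuilder b = new StringBuilder();
            do {
                b.append((char) peek);
                peek = in.read();
            } while (peek != EOF && Character.isLetterOrDigit(peek));
            return b.toString();
        }

        // Any other character is a one-character token.
        String single = String.valueOf((char) peek);
        peek = in.read();   // advance instead of resetting peek to ' '
        return single;
    }

    /** Convenience helper for the example: tokenizes an in-memory string. */
    static List<String> tokenizeAll(String text) {
        EofAwareScanner s = new EofAwareScanner(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        try {
            for (String t = s.next(); t != null; t = s.next()) {
                out.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Same text as delete.txt, followed by blank lines and trailing spaces.
        System.out.println(tokenizeAll("I have\nbulldoz//er\n\n   \n"));
        // prints [I, have, bulldoz, /, /, er]
    }
}
```

With this shape the caller loops until next() returns null, so trailing newlines or spaces before the end of the stream cannot hide the -1. To read from standard input as in the question, new EofAwareScanner(System.in) could be used in place of the in-memory stream.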
