
How to read each character from a BitInputStream when the number of bits varies per character

I apologize, it wouldn't let me tag this as homework.

Hello there, I'm working on a school project that uses Huffman coding to compress data from a file. In this assignment, you are supposed to read from a file using a BitInputStream object. I'm not sure it is in the JCL, because the documentation the professor provided has spelling errors and isn't very explicit about certain things. Anyway, it seems to work about the same as other classes that extend InputStream. The lines of code I keep getting from the class forum are the following:

        try {
            BitInputStream b = new BitInputStream(in);
            int data;

            while((data  = b.readBits(BITS_PER_WORD)) != -1) {
                data = b.readBits(BITS_PER_WORD);
                q.freq[data]++; //instance variable (size 256) in PriorityQueue q to 
                //count number of occurrences of each piece of data.
                System.out.println(data);
            }
        }catch(FileNotFoundException e) {
            System.out.println("File not found.");
        }
        catch(IOException e) {
            System.out.println("Error while reading file.");
        }

...where the parameter in is the generic input stream object and BITS_PER_WORD = 8, inherited from an interface of constants. The problem is that whenever I run this, it appears to skip every other character in the file, starting with the first. So, for example, a small .txt file containing "Eerie eyes seen near lake." would print: 101 105 32 121 115 115 101 32 101 114 108 107 46 10 ('e', 'i', ' ', 'y', etc.). I imagine this has something to do with trying to read 8 bits at a time, as the ASCII value of 'a', for example, is 1100001 in binary (7 bits) and space is 100000 (6 bits). I was wondering if I have to somehow vary the number of bits it's trying to read (and how on earth I would do that), or if I'm coming at this the wrong way (I've only recently gotten used to the idea of working with bits/bytes, and there may be something important I don't know).

I apologize for the lengthy question, but let me know if I left out any important info. Thanks!

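You seem to be calling readBits() an extra time: once in the while condition and again as the first statement of the loop body. Each call consumes the next 8 bits, so the value read in the condition is overwritten before it is ever used. This is probably why it is skipping every other letter. The number of bits per character does not vary in the input file: every character is stored as a full 8 bits, with leading zeros where needed. You should have something like: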

while((data = b.readBits(BITS_PER_WORD)) != -1) {
  q.freq[data]++; // use the value read in the condition; no second readBits() call here
}
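
To make the fix concrete, here is a minimal, self-contained sketch of a read loop that calls readBits() exactly once per iteration. The course's BitInputStream is not available here, so SimpleBitReader is a hypothetical stand-in that only assumes readBits(n) returns the next n bits as an int, or -1 at end of input. The class and method names (ReadLoopSketch, countFrequencies) are also made up for illustration, and a plain int array takes the place of q.freq.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadLoopSketch {
    static final int BITS_PER_WORD = 8;

    // Hypothetical stand-in for the course's BitInputStream (an assumption,
    // not the real class): returns the next n bits, most significant bit
    // first, or -1 once the underlying stream runs out.
    static final class SimpleBitReader {
        private final InputStream in;
        private int current;   // byte currently being consumed
        private int bitsLeft;  // unread bits remaining in 'current'

        SimpleBitReader(InputStream in) {
            this.in = in;
        }

        int readBits(int n) throws IOException {
            int value = 0;
            for (int i = 0; i < n; i++) {
                if (bitsLeft == 0) {
                    current = in.read();
                    if (current == -1) {
                        return -1; // end of input
                    }
                    bitsLeft = 8;
                }
                bitsLeft--;
                value = (value << 1) | ((current >> bitsLeft) & 1);
            }
            return value;
        }
    }

    // Counts how often each 8-bit value occurs. readBits() is called once per
    // iteration, in the loop condition only, so no value is thrown away.
    static int[] countFrequencies(InputStream in) throws IOException {
        int[] freq = new int[1 << BITS_PER_WORD];
        SimpleBitReader b = new SimpleBitReader(in);
        int data;
        while ((data = b.readBits(BITS_PER_WORD)) != -1) {
            freq[data]++;
        }
        return freq;
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "Eerie eyes seen near lake.".getBytes(StandardCharsets.US_ASCII);
        int[] freq = countFrequencies(new ByteArrayInputStream(text));
        System.out.println("e: " + freq['e'] + ", space: " + freq[' ']);

        // 'a' fits in 7 significant bits, but it is stored as a full byte
        // with a leading zero, so reading 8 bits at a time is correct.
        String bits = String.format("%8s", Integer.toBinaryString('a')).replace(' ', '0');
        System.out.println(bits);
    }
}
```

Running main on the sample sentence prints "e: 8, space: 4" followed by "01100001". The second line shows why the read width does not need to change per character when reading the uncompressed file. Variable-length codes only come into play later, when reading the Huffman-compressed output.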
