
Huffman algorithm: need help storing char codes

I have written a program that obtains the code for each character, as shown in the program output below.

enter some text: nslfaslfjasfj
text = "nslfaslfjasfj"

a:2
f:3
j:2
l:2
n:1
s:3

Huffman Algorithm Below is "CHAR CODE":

n code:111
j code:110
f code:10
s code:01
l code:001
a code:000

My next step should be to store the above in a structure and then walk through my original text = "nslfaslfjasfj", looking up each character to encode it as "11101... and so on".

I am having trouble storing the "CHAR CODE" in a structure. Should each code be stored as a string, like string s = "111", and then placed in the structure? Thanks in advance.

Usually the point of Huffman encoding is to decrease the length of the message, i.e. to compress it. This means you want to write out bits, not the characters '0' and '1'. It therefore makes sense to store your character codes as bits too, and to use bit operations to transfer them to the stream. Storing a pair (character code, code length) for each element is sufficient to construct the encoding.
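A minimal sketch of such a table in C, assuming a 256-entry array indexed by character value. The names `HuffCode`, `code_table`, `set_code`, `load_example_codes` and `encoded_bit_count` are made up for illustration; the codes are the ones from the question's output. The length has to be stored next to the bits because leading zeros are significant: `01` (s) and `001` (l) have the same numeric value and differ only in length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per symbol: the code's bits (right-aligned) and its length. */
typedef struct {
    uint32_t bits;   /* e.g. 0x6 for the code 110 */
    uint8_t  length; /* number of valid low bits, e.g. 3; 0 means "no code" */
} HuffCode;

/* Indexed by unsigned char value (assumption: one code per byte value). */
static HuffCode code_table[256];

/* Hypothetical helper: register the code for one character. */
static void set_code(unsigned char ch, uint32_t bits, uint8_t length)
{
    assert(length > 0 && length <= 32);
    code_table[ch].bits = bits;
    code_table[ch].length = length;
}

/* Fill the table with the codes printed in the question. */
static void load_example_codes(void)
{
    set_code('n', 0x7, 3); /* 111 */
    set_code('j', 0x6, 3); /* 110 */
    set_code('f', 0x2, 2); /* 10  */
    set_code('s', 0x1, 2); /* 01  */
    set_code('l', 0x1, 3); /* 001 */
    set_code('a', 0x0, 3); /* 000 */
}

/* Number of bits needed to encode a NUL-terminated text with the table. */
static size_t encoded_bit_count(const char *text)
{
    size_t total = 0;
    for (; *text != '\0'; ++text)
        total += code_table[(unsigned char)*text].length;
    return total;
}
```

With these codes, `encoded_bit_count("nslfaslfjasfj")` comes to 33 bits, compared with 13 * 8 = 104 bits for the plain text.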

Having said that, you can do it with strings as you suggest. It's not wrong, and it might make debugging a little easier, but it will perform worse, since every code bit then occupies a whole char.
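For the string variant, a rough sketch might look like the following. It assumes a lookup array of C string literals indexed by character value; `codes`, `load_example_string_codes` and `encode_as_text` are hypothetical names, and the output buffer is supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* String-based table: codes[c] is a "0"/"1" string, or NULL if c has no code. */
static const char *codes[256];

/* Fill the table with the codes printed in the question. */
static void load_example_string_codes(void)
{
    codes['n'] = "111";
    codes['j'] = "110";
    codes['f'] = "10";
    codes['s'] = "01";
    codes['l'] = "001";
    codes['a'] = "000";
}

/* Concatenate the code string of every character of text into out.
   Returns 0 on success, -1 if a character has no code or out is too small. */
static int encode_as_text(const char *text, char *out, size_t out_size)
{
    size_t used = 0;

    assert(text != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (; *text != '\0'; ++text) {
        const char *code = codes[(unsigned char)*text];
        size_t n;

        if (code == NULL)
            return -1;
        n = strlen(code);
        if (used + n + 1 > out_size)
            return -1;
        memcpy(out + used, code, n + 1); /* copies the terminating NUL too */
        used += n;
    }
    return 0;
}
```

Called on "nslfaslfjasfj" with a sufficiently large buffer, this produces the 33-character string "111010011000001001101100000110110", which is handy for checking the codes by eye but is longer than the 13-character input.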

You're going to need to make some kind of "BitWriter". We covered bitwise I/O in Java for Huffman coding in a data structures class at my university; the lecture slides are freely available here. Obviously Java != C, but the concept is the same.
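In C, one possible shape for such a writer is sketched below. This is not taken from the slides mentioned above; it is an illustration that assumes bits are packed most significant bit first into a caller-supplied memory buffer, which can later be written to a file with `fwrite`. The names `BitWriter`, `bw_init`, `bw_write` and `bw_flush` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packs bits MSB-first into a caller-supplied byte buffer. */
typedef struct {
    uint8_t *buf;      /* destination buffer, owned by the caller */
    size_t   capacity; /* size of buf in bytes */
    size_t   pos;      /* number of complete bytes stored so far */
    uint8_t  current;  /* byte currently being filled */
    int      used;     /* bits already placed in current (0..7) */
} BitWriter;

static void bw_init(BitWriter *bw, uint8_t *buf, size_t capacity)
{
    bw->buf = buf;
    bw->capacity = capacity;
    bw->pos = 0;
    bw->current = 0;
    bw->used = 0;
}

/* Append the low `length` bits of `bits`, most significant of them first.
   Returns 0 on success, -1 if the buffer is full. */
static int bw_write(BitWriter *bw, uint32_t bits, int length)
{
    int i;

    assert(length >= 0 && length <= 32);
    for (i = length - 1; i >= 0; --i) {
        bw->current = (uint8_t)((bw->current << 1) | ((bits >> i) & 1u));
        if (++bw->used == 8) {
            if (bw->pos == bw->capacity)
                return -1;
            bw->buf[bw->pos++] = bw->current;
            bw->current = 0;
            bw->used = 0;
        }
    }
    return 0;
}

/* Pad a partially filled last byte with zero bits and store it.
   Returns 0 on success, -1 if the buffer is full. */
static int bw_flush(BitWriter *bw)
{
    if (bw->used > 0) {
        if (bw->pos == bw->capacity)
            return -1;
        bw->buf[bw->pos++] = (uint8_t)(bw->current << (8 - bw->used));
        bw->current = 0;
        bw->used = 0;
    }
    return 0;
}
```

Writing the codes for n, s, l and f from the question (111, 01, 001, 10) and then flushing yields the two bytes 0xE9 and 0x80. Because the final byte is padded with zeros, a decoder also needs to know how many bits or symbols are valid, for example from a count stored ahead of the data.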

Considering that you're trying to compress something, using strings would be a horrible idea. You'll want to store the char codes as raw binary.
