I am compressing a text file using a Huffman code I generated. I converted all characters to a string of 0's and 1's and wrote them to a file using the following code. (The input was 1011001110010011.)
public static void writeToFile(String binaryString, BufferedWriter writer) throws IOException {
    int pos = 0;
    while (pos < binaryString.length()) {
        byte nextByte = 0x00;
        for (int i = 0; i < 8 && pos + i < binaryString.length(); i++) {
            nextByte = (byte) (nextByte << 1);
            nextByte += binaryString.charAt(pos + i) == '0' ? 0x0 : 0x1;
        }
        writer.write(nextByte);
        pos += 8;
    }
}
Then I tried to regenerate the original binary string 1011001110010011 from the file I had just created, using the following code:
data = Files.readAllBytes(path);
for (int i = 0; i < data.length; i++) {
    byte nextByte = data[i];
    String tempString = "";
    for (int j = 0; j < 8; j++) {
        byte temp = (byte) (0x1 & nextByte);
        if (temp == 0x1) {
            tempString = "1".concat(tempString);
        } else if (temp == 0x0) {
            tempString = "0".concat(tempString);
        }
        nextByte = (byte) (nextByte >> 1);
    }
    binary = binary.concat(tempString);
}
But I got 111011111011111010110011111011111011111010010011 as output, when I was only expecting some appended 0's.
Edit: I changed the string-to-binary code; it now adds 0's at the end to complete the last byte.
public static void writeToFile(String binaryString, BufferedWriter writer) throws IOException {
    int pos = 0;
    while (pos < binaryString.length()) {
        byte nextByte = 0x00;
        for (int i = 0; i < 8; i++) {
            nextByte = (byte) (nextByte << 1);
            if (pos + i < binaryString.length())
                nextByte += binaryString.charAt(pos + i) == '0' ? 0x0 : 0x1;
        }
        writer.write(nextByte);
        pos += 8;
    }
}
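The problem is that BufferedWriter.write() writes a char, not a byte. Each call hands the writer one character, which it then encodes with its charset into one or more bytes, so you end up with more stored in your file than you were expecting. Here both of your bytes (0xB3 and 0x93) have the high bit set, so they are sign-extended into the chars U+FFB3 and U+FF93. Assuming the writer encodes as UTF-8, each becomes three bytes (EF BE B3 and EF BE 93), which is exactly the 48 bits you read back.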
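As a rough illustration of how one byte can turn into several, the sketch below pushes a single byte value through a Writer and prints the bytes that actually come out. The class and method names are made up for this example, and UTF-8 is an assumption about the charset in use (the question does not say how the BufferedWriter was created).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class CharVsByteDemo {
    // Writes one byte value through a UTF-8 Writer and returns the bytes it produced.
    // (Hypothetical helper; UTF-8 is assumed, not taken from the question.)
    static byte[] bytesProducedByWriter(byte value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            // The byte is widened (sign-extended) to int, and write(int)
            // keeps the low 16 bits as a char before encoding it.
            writer.write(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Renders each byte as 8 binary digits, most significant bit first.
    static String toBits(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            for (int bit = 7; bit >= 0; bit--) {
                sb.append((b >> bit) & 1);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0xB3 = 10110011, the first byte of the question's input
        System.out.println(toBits(bytesProducedByWriter((byte) 0xB3)));
        // prints 111011111011111010110011
        // 0x93 = 10010011, the second byte
        System.out.println(toBits(bytesProducedByWriter((byte) 0x93)));
        // prints 111011111011111010010011
    }
}
```

Concatenating the two printed lines gives the 48-bit string from the question.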
You want to use new BufferedOutputStream(new FileOutputStream("filename")) instead, and change the signature of your method to take an OutputStream.
(You might notice that OutputStream.write() takes an int rather than a byte, but that is just there to confuse you... it actually writes only the low-order byte, rather than the whole int, so it does what you want.)
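Put together, the change might look something like the sketch below. It is only one way to do it: the class name, readBits, and the roundTrip helper (which uses a temporary file) are invented for the example, and the reading loop is a shortened equivalent of the one in the question.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BitFileRoundTrip {
    // Packs '0'/'1' characters into bytes, most significant bit first,
    // padding the last byte with 0 bits.
    static void writeToFile(String binaryString, OutputStream out) throws IOException {
        for (int pos = 0; pos < binaryString.length(); pos += 8) {
            int nextByte = 0;
            for (int i = 0; i < 8; i++) {
                nextByte <<= 1;
                if (pos + i < binaryString.length() && binaryString.charAt(pos + i) == '1') {
                    nextByte |= 1;
                }
            }
            out.write(nextByte); // only the low-order 8 bits are written
        }
    }

    // Reads the file back into a string of '0' and '1' characters.
    static String readBits(Path path) throws IOException {
        StringBuilder binary = new StringBuilder();
        for (byte b : Files.readAllBytes(path)) {
            for (int bit = 7; bit >= 0; bit--) {
                binary.append((b >> bit) & 1);
            }
        }
        return binary.toString();
    }

    // Hypothetical helper: writes to a temporary file, reads it back, cleans up.
    static String roundTrip(String binaryString) {
        try {
            Path path = Files.createTempFile("bits", ".bin");
            try (OutputStream out = new BufferedOutputStream(new FileOutputStream(path.toFile()))) {
                writeToFile(binaryString, out);
            }
            String result = readBits(path);
            Files.delete(path);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("1011001110010011")); // prints 1011001110010011
        System.out.println(roundTrip("101"));              // prints 10100000
    }
}
```

Note that the padding 0's in the last byte look the same as real data when read back, so a Huffman decoder will probably also need the original bit count stored somewhere.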