Is it possible to read/write bits from a file using Java?

To read/write binary files, I am using DataInputStream/DataOutputStream. They have the methods writeByte()/readByte(), but what I want to do is read/write individual bits. Is that possible?

I want to use it for a compression algorithm. When compressing, I want to write 3 bits per number (and there are millions of such numbers in a file). If I write a whole byte every time I need to write 3 bits, I will write loads of redundant data.

It's not possible to read/write individual bits directly; the smallest unit you can read/write is a byte.

You can use the standard bitwise operators to manipulate a byte, though. For example, to get the lowest 2 bits of a byte, you'd do:

byte b = in.readByte();
byte lowBits = (byte) (b & 0x3); // cast needed: b & 0x3 is an int

To set the low 4 bits to 1 and write the byte:

b |= 0xf;
out.writeByte(b);

(Note: for the sake of efficiency you might want to read/write byte arrays rather than single bytes.)
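
To connect the bitwise operators above to the 3-bit case in the question, here is a minimal sketch that packs values in the range 0..7 into a byte array and unpacks them again. The class name ThreeBitPacker, the highest-bit-first layout, and the separate count argument are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.Arrays;

class ThreeBitPacker {

    // Packs values in the range 0..7 into a byte array, 3 bits each, highest bit first.
    static byte[] pack(int[] values) {
        byte[] out = new byte[(values.length * 3 + 7) / 8];
        int bitPos = 0; // index of the next free bit in the output
        for (int v : values) {
            if (v < 0 || v > 7) {
                throw new IllegalArgumentException("value does not fit in 3 bits: " + v);
            }
            for (int shift = 2; shift >= 0; shift--) {
                if (((v >> shift) & 1) == 1) {
                    out[bitPos / 8] |= (byte) (0x80 >> (bitPos % 8));
                }
                bitPos++;
            }
        }
        return out;
    }

    // Reverses pack(); count is needed because the last byte may contain padding bits.
    static int[] unpack(byte[] data, int count) {
        int[] values = new int[count];
        int bitPos = 0;
        for (int i = 0; i < count; i++) {
            int v = 0;
            for (int k = 0; k < 3; k++) {
                int bit = (data[bitPos / 8] >> (7 - bitPos % 8)) & 1;
                v = (v << 1) | bit;
                bitPos++;
            }
            values[i] = v;
        }
        return values;
    }

    public static void main(String[] args) {
        int[] numbers = {3, 4, 7, 0, 5};
        byte[] packed = pack(numbers); // 15 bits fit in 2 bytes instead of 5
        int[] restored = unpack(packed, numbers.length);
        System.out.println(packed.length + " bytes, round trip ok: " + Arrays.equals(numbers, restored));
    }
}
```

The resulting byte array can then be written in one call with any OutputStream, which also follows the advice about writing arrays rather than single bytes.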

There's no way to do it directly. The smallest unit a computer can address is a byte (even booleans typically take up a byte). However, you can create a custom stream class that packs a byte with the bits you want and then writes it. You can then make a wrapper for this class whose write function takes some integral type, checks that it's between 0 and 7 (or -4 and 3, or whatever), extracts the bits in the same way the BitInputStream class (below) does, and makes the corresponding calls to the BitOutputStream's write method.

You might be thinking that you could just make one set of IO stream classes, but 3 doesn't go into 8 evenly. So if you want optimum storage efficiency and you don't want to work really hard, you're kind of stuck with two layers of abstraction. Below is a BitOutputStream class, a corresponding BitInputStream class, and a program that makes sure they work.

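import java.io.IOException;
import java.io.OutputStream;

class BitOutputStream {

    private OutputStream out;
    private boolean[] buffer = new boolean[8];
    private int count = 0; // number of bits currently buffered

    public BitOutputStream(OutputStream out) {
        this.out = out;
    }

    public void write(boolean x) throws IOException {
        // Fill the buffer from the end so the first bit written becomes the lowest bit of the byte
        this.count++;
        this.buffer[8-this.count] = x;
        if (this.count == 8){
            flushBuffer();
        }
    }

    public void close() throws IOException {
        // Write a final partial byte only if bits are pending; the unused high bits are zero padding
        if (this.count > 0){
            flushBuffer();
        }

        this.out.close();
    }

    // Packs the buffer into one byte (buffer[0] is the highest bit), writes it and clears the buffer
    private void flushBuffer() throws IOException {
        int num = 0;
        for (int index = 0; index < 8; index++){
            num = 2*num + (this.buffer[index] ? 1 : 0);
            this.buffer[index] = false;
        }

        // OutputStream.write(int) keeps only the low 8 bits; BitInputStream adds the 128 back
        this.out.write(num - 128);

        this.count = 0;
    }

}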

I'm sure there's a way to pack the int with bitwise operators and thus avoid having to reverse the input, but I don't want to think that hard.

Also, you probably noticed that this implementation gives the reader no way to detect that the last bit has been read (the final byte may be padded), but I really don't want to think that hard.

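import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class BitInputStream {

    private InputStream in;
    private int num = 0;
    private int count = 8; // bits already consumed from num; 8 forces a read on first use

    public BitInputStream(InputStream in) {
        this.in = in;
    }

    public boolean read() throws IOException {
        if (this.count == 8){
            int next = this.in.read();
            if (next == -1){
                // InputStream.read() returns -1 once the underlying stream is exhausted
                throw new EOFException("no more bytes to read bits from");
            }
            // Undo the -128 offset applied by BitOutputStream; only the low 8 bits are consumed
            this.num = next + 128;
            this.count = 0;
        }

        // Bits come out lowest first, matching the order in which BitOutputStream packed them
        boolean x = (num%2 == 1);
        num /= 2;
        this.count++;

        return x;
    }

    public void close() throws IOException {
        this.in.close();
    }

}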

You probably know this, but you should put a BufferedInputStream/BufferedOutputStream between your bit stream and the file stream, or it'll take forever.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Random;

class Test {

    private static final int n = 1000000;

    public static void main(String[] args) throws IOException {

        Random random = new Random();

        //Generate array

        long startTime = System.nanoTime();

        boolean[] outputArray = new boolean[n];
        for (int index = 0; index < n; index++){
            outputArray[index] = random.nextBoolean();
        }

        System.out.println("Array generated in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Write to file

        startTime = System.nanoTime();

        BitOutputStream fout = new BitOutputStream(new BufferedOutputStream(new FileOutputStream("booleans.bin")));

        for (int index = 0; index < n; index++){
            fout.write(outputArray[index]);
        }

        fout.close();

        System.out.println("Array written to file in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Read from file

        startTime = System.nanoTime();

        BitInputStream fin = new BitInputStream(new BufferedInputStream(new FileInputStream("booleans.bin")));

        boolean[] inputArray = new boolean[n];
        for (int index = 0; index < n; index++){
            inputArray[index] = fin.read();
        }

        fin.close();

        System.out.println("Array read from file in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Delete file
        new File("booleans.bin").delete();

        //Check equality

        boolean equal = true;
        for (int index = 0; index < n; index++){
            if (outputArray[index] != inputArray[index]){
                equal = false;
                break;
            }
        }

        System.out.println("Input " + (equal ? "equals " : "doesn't equal ") + "output.");
    }

}
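
The answer above mentions two things it leaves out: packing with bitwise operators instead of a boolean buffer, and a wrapper that accepts small integers such as 3-bit values. A minimal sketch of both ideas follows. The names ShiftBitIO, Writer, Reader, writeBits and readBits are hypothetical, and the byte layout (highest bit first, no 128 offset) is an assumption that differs from the BitOutputStream/BitInputStream classes above, so the two are not interchangeable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ShiftBitIO {

    // Hypothetical writer: collects bits in an int using shifts and emits full bytes.
    static class Writer {
        private final OutputStream out;
        private int acc = 0;   // pending bits, kept in the low end
        private int nBits = 0; // number of pending bits (always < 8 between calls)

        Writer(OutputStream out) { this.out = out; }

        // Writes the low 'width' bits of value, highest bit first (1 <= width <= 24).
        void writeBits(int value, int width) throws IOException {
            if (width < 1 || width > 24) throw new IllegalArgumentException("width: " + width);
            acc = (acc << width) | (value & ((1 << width) - 1));
            nBits += width;
            while (nBits >= 8) {
                nBits -= 8;
                out.write(acc >>> nBits); // write(int) keeps only the low 8 bits
            }
            acc &= (1 << nBits) - 1; // drop the bits that were already written
        }

        // Pads the last byte with zero bits and closes the underlying stream.
        void close() throws IOException {
            if (nBits > 0) out.write(acc << (8 - nBits));
            out.close();
        }
    }

    // Hypothetical reader: the mirror image of Writer.
    static class Reader {
        private final InputStream in;
        private int acc = 0;
        private int nBits = 0;

        Reader(InputStream in) { this.in = in; }

        // Reads 'width' bits (1 <= width <= 24) and returns them as an unsigned value.
        int readBits(int width) throws IOException {
            if (width < 1 || width > 24) throw new IllegalArgumentException("width: " + width);
            while (nBits < width) {
                int next = in.read();
                if (next == -1) throw new EOFException("ran out of bytes");
                acc = (acc << 8) | next;
                nBits += 8;
            }
            nBits -= width;
            int value = (acc >>> nBits) & ((1 << width) - 1);
            acc &= (1 << nBits) - 1; // keep only the unread bits
            return value;
        }
    }

    public static void main(String[] args) throws IOException {
        int[] numbers = {3, 4, 7, 0, 5};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Writer writer = new Writer(bytes);
        for (int n : numbers) writer.writeBits(n, 3);
        writer.close();

        Reader reader = new Reader(new ByteArrayInputStream(bytes.toByteArray()));
        for (int n : numbers) {
            System.out.println(n + " -> " + reader.readBits(3));
        }
    }
}
```

As with the classes above, the reader cannot tell padding bits from data, so the number of values has to be stored or known separately.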

Please take a look at my bit-io library, https://github.com/jinahya/bit-io, which can read and write non-octet-aligned values such as a 1-bit boolean or a 17-bit unsigned integer.

<dependency>
  <!-- resides in central repo -->
  <groupId>com.googlecode.jinahya</groupId>
  <artifactId>bit-io</artifactId>
  <version>1.0-alpha-13</version>
</dependency>

This library reads and writes values of arbitrary bit length.

final InputStream stream;
final BitInput input = new BitInput(new BitInput.StreamInput(stream));

final boolean b = input.readBoolean(); // reads a 1-bit boolean value
final int i = input.readUnsignedInt(3); // reads a 3-bit unsigned int
final long l = input.readLong(47); // reads a 47-bit signed long

input.align(1); // 8-bit byte align; padding


final WritableByteChannel channel;
final BitOutput output = new BitOutput(new BitOutput.ChannelOutput(channel));

output.writeBoolean(true); // writes a 1-bit boolean value
output.writeInt(17, 0x00); // writes a 17-bit signed int
output.writeUnsignedLong(54, 0x00L); // writes a 54-bit unsigned long

output.align(4); // 32-bit byte align; discarding

InputStreams and OutputStreams are streams of bytes.

To read a bit you'll need to read a byte and then use bit manipulation to inspect the bits you care about. Likewise, to write bits you'll need to write bytes containing the bits you want.

Yes and no. On most modern computers, a byte is the smallest addressable unit of memory, so you can only read/write entire bytes at a time. However, you can always use bitwise operators to manipulate the bits within a byte.

If you are just writing bits to a file, Java's BitSet class might be worth a look. From the javadoc:

This class implements a vector of bits that grows as needed. Each component of the bit set has a boolean value. The bits of a BitSet are indexed by nonnegative integers. Individual indexed bits can be examined, set, or cleared. One BitSet may be used to modify the contents of another BitSet through logical AND, logical inclusive OR, and logical exclusive OR operations.

You can convert a BitSet to a long[] or byte[] to save the data to a file.
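
A minimal sketch of that conversion, using BitSet.toByteArray() and BitSet.valueOf(byte[]) together with java.nio.file.Files. The class name BitSetFileDemo, the lowest-bit-first layout of each 3-bit value, and the temporary file are assumptions for illustration. Note that toByteArray() only covers bits up to the highest set bit, so trailing zero values are not represented in the output and the number of values must be stored separately.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.BitSet;

class BitSetFileDemo {

    // Stores each value (0..7) in 3 consecutive bits of a BitSet, lowest bit first.
    static BitSet encode(int[] values) {
        BitSet bits = new BitSet(values.length * 3);
        for (int i = 0; i < values.length; i++) {
            for (int k = 0; k < 3; k++) {
                if (((values[i] >> k) & 1) == 1) {
                    bits.set(i * 3 + k);
                }
            }
        }
        return bits;
    }

    // Rebuilds count values; bits beyond the stored length simply read as false.
    static int[] decode(BitSet bits, int count) {
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            for (int k = 0; k < 3; k++) {
                if (bits.get(i * 3 + k)) {
                    values[i] |= 1 << k;
                }
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        int[] numbers = {3, 4, 0, 0, 0};
        Path file = Files.createTempFile("bits", ".bin"); // temporary file, name is arbitrary

        // Only 1 byte is written here, because the trailing zero values set no bits
        Files.write(file, encode(numbers).toByteArray());

        BitSet restored = BitSet.valueOf(Files.readAllBytes(file));
        System.out.println(Arrays.toString(decode(restored, numbers.length)));

        Files.delete(file);
    }
}
```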

Bits are packaged in bytes, and apart from VHDL/Verilog I have seen no language that allows you to append individual bits to a stream. Cache your bits and pack them into a byte for writing, using a buffer and bitmasking. Do the reverse for reading, i.e. keep a pointer into your buffer and increment it as you return individually masked bits.
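
The read side described above could be sketched roughly as follows. The class name BitCursor and the highest-bit-first order within each byte are assumptions, not something the answer specifies.

```java
import java.util.NoSuchElementException;

class BitCursor {

    private final byte[] buffer;
    private int pos = 0; // index of the next bit to return

    BitCursor(byte[] buffer) {
        this.buffer = buffer;
    }

    boolean hasNext() {
        return pos < buffer.length * 8;
    }

    // Returns the next bit (highest bit of each byte first) and advances the pointer.
    boolean next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no bits left");
        }
        int mask = 0x80 >> (pos % 8);
        boolean bit = (buffer[pos / 8] & mask) != 0;
        pos++;
        return bit;
    }

    public static void main(String[] args) {
        BitCursor cursor = new BitCursor(new byte[] {(byte) 0xA5}); // 0xA5 is 1010 0101
        StringBuilder sb = new StringBuilder();
        while (cursor.hasNext()) {
            sb.append(cursor.next() ? '1' : '0');
        }
        System.out.println(sb); // prints 10100101
    }
}
```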

AFAIK there is no function for doing this in the Java API. However, you can of course read a byte and then use bit manipulation. The same goes for writing.

The code below should work:

    int[] mynumbers = {3,4};
    BitSet compressedNumbers = new BitSet(mynumbers.length*3);
    // let's say you encoded 3 as 101 and 4 as 010
    String myNumbersAsBinaryString = "101010"; 
    for (int i = 0; i < myNumbersAsBinaryString.length(); i++) {
        if(myNumbersAsBinaryString.charAt(i) == '1')
            compressedNumbers.set(i);
    }
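    // Resources is Guava's com.google.common.io.Resources; myfile.out must exist on the classpath
    String path = Resources.getResource("myfile.out").getPath();
    // try-with-resources closes the stream, which also flushes the buffered object data
    try (ObjectOutputStream outputStream = new ObjectOutputStream(new FileOutputStream(path))) {
        outputStream.writeObject(compressedNumbers);
    } catch (IOException e) {
        e.printStackTrace();
    }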
