
Writing a BitSet to a file in Java

I have a BitSet and want to write it to a file. I came across a solution that uses an ObjectOutputStream and its writeObject method.

I looked at ObjectOutputStream in the Java API and saw that you can write other things too (byte, int, short, etc.).

To try the class out, I wrote a single byte to a file using the following code, but the result is a file with 7 bytes instead of 1.

My question is: what are the first 6 bytes in the file, and why are they there?

This is relevant to a BitSet because I don't want to start writing lots of data to a file and then realize that unknown bytes have been inserted into it.

Here is the code:

    byte[] bt = new byte[]{'A'};
    File outFile = new File("testOut.txt");
    FileOutputStream fos = new FileOutputStream(outFile);
    ObjectOutputStream oos = new ObjectOutputStream(fos);
    oos.write(bt);
    oos.close();

Thanks for any help,

Avner

The other bytes will be type information.

Basically, ObjectOutputStream is a class used to write Serializable objects to some destination (usually a file). It makes more sense if you think about ObjectInputStream, which has a readObject() method. How does Java know what Object to instantiate? Easy: there is type information in the stream.

You could be writing any objects out to an ObjectOutputStream, so the stream holds information about the types written as well as the data needed to reconstitute each object.
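To make the role of the type information concrete, here is a minimal sketch of a round trip through writeObject and readObject. The class and method names (`BitSetRoundTrip`, `serialize`, `deserialize`) are made up for illustration, and an in-memory buffer stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.BitSet;

public class BitSetRoundTrip {

    // Serialize a BitSet with writeObject; the stream records the class
    // description alongside the data.
    static byte[] serialize(BitSet bits) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(bits);
        }
        return buffer.toByteArray();
    }

    // readObject relies on the recorded class description to decide
    // which class to instantiate.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        BitSet original = new BitSet();
        original.set(1);
        original.set(10);

        Object restored = deserialize(serialize(original));
        System.out.println(restored.getClass().getName()); // java.util.BitSet
        System.out.println(original.equals(restored));     // true
    }
}
```

The caller never tells readObject which class to expect; that information comes from the stream itself.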

If you know that the stream will always contain a BitSet, don't use an ObjectOutputStream. If space is at a premium, convert the BitSet to an array of bytes in which each bit corresponds to a bit in the BitSet, then write that directly to the underlying stream (e.g. a FileOutputStream, as in your example).
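One way to do this, assuming Java 7 or later, is to use the built-in `BitSet.toByteArray()` and `BitSet.valueOf(byte[])`. The sketch below writes the raw bytes through a FileOutputStream and reads them back; the class name `RawBitSetFile` and the temporary file are only for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.util.BitSet;

public class RawBitSetFile {

    // Write only the packed bits: no stream header, no type information.
    static void write(BitSet bits, File file) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file)) {
            fos.write(bits.toByteArray());
        }
    }

    // Rebuild the BitSet from the raw bytes in the file.
    static BitSet read(File file) throws IOException {
        return BitSet.valueOf(Files.readAllBytes(file.toPath()));
    }

    public static void main(String[] args) throws IOException {
        BitSet bits = new BitSet();
        bits.set(0);
        bits.set(9);

        File file = File.createTempFile("bitset", ".bin");
        try {
            write(bits, file);
            System.out.println(file.length());           // 2
            System.out.println(read(file).equals(bits)); // true
        } finally {
            file.delete();
        }
    }
}
```

Note that `toByteArray()` stops at the highest set bit, so trailing clear bits are not stored. If the logical size of the set matters, it has to be written separately.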

The serialisation format, like many others, includes a header with a magic number and version information. When you use DataOutput/OutputStream methods on ObjectOutputStream, the data is placed in the middle of the serialised data (with no type information). This is typically only done in writeObject implementations, after a call to defaultWriteObject or use of putFields.
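The 7 bytes from the question can be inspected directly. This sketch repeats the question's write against an in-memory buffer and prints the result as hex; the class name `StreamHeaderDump` and the `toHex` helper are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class StreamHeaderDump {

    // Same calls as in the question, but writing to memory instead of a file.
    static byte[] writeOneByte() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(buffer);
        oos.write(new byte[] {'A'});
        oos.close();
        return buffer.toByteArray();
    }

    // Format bytes as space-separated, two-digit uppercase hex.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(toHex(writeOneByte())); // AC ED 00 05 77 01 41
    }
}
```

Reading the output against the constants in `java.io.ObjectStreamConstants`: `AC ED` is `STREAM_MAGIC`, `00 05` is `STREAM_VERSION`, `77` is `TC_BLOCKDATA`, `01` is the length of the block, and `41` is the byte `'A'` itself.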

If you only use the saved BitSet from Java, serialization works fine. However, it is awkward if you want to share the bitset across multiple platforms. Besides the overhead of Java serialization, the BitSet is stored in units of 8 bytes, which can add too much overhead if your bitset is small.
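A rough way to see the overhead is to compare sizes for a tiny bitset. The sketch below assumes Java 7 or later for `toLongArray()` and `toByteArray()`; the class name `BitSetSizeComparison` is made up, and the exact serialized size is printed rather than assumed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.BitSet;

public class BitSetSizeComparison {

    // Number of bytes produced by standard Java serialization.
    static int serializedSize(BitSet bits) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(bits);
        }
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        BitSet bits = new BitSet();
        bits.set(3); // a single bit

        // Full serialized form: header, class description, and the long[] words.
        System.out.println("serialized: " + serializedSize(bits) + " bytes");
        // The payload alone is held in 64-bit words.
        System.out.println("as longs:   " + bits.toLongArray().length * 8 + " bytes"); // 8
        // Packed into bytes, one byte is enough.
        System.out.println("as bytes:   " + bits.toByteArray().length + " bytes");     // 1
    }
}
```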

We wrote this small class so we can extract byte arrays from a BitSet. Depending on your use case, it might work better for you than Java serialization.

    import java.util.BitSet;

    /** A BitSet that converts to and from a compact byte array, most significant bit first. */
    public class ExportableBitSet extends BitSet {

        private static final long serialVersionUID = 1L;

        public ExportableBitSet() {
            super();
        }

        public ExportableBitSet(int nbits) {
            super(nbits);
        }

        // Builds the set from a packed byte array; a null array yields an empty set.
        public ExportableBitSet(byte[] bytes) {
            this(bytes == null ? 0 : bytes.length * 8);
            for (int i = 0; i < size(); i++) {
                if (isBitOn(i, bytes))
                    set(i);
            }
        }

        // Packs the bits into the smallest byte array that holds the highest set bit.
        // Note: on Java 7+ this overrides BitSet.toByteArray(), which uses the
        // opposite (little-endian) bit order within each byte.
        public byte[] toByteArray() {
            // length() is the index of the highest set bit plus one (0 if empty)
            int n = (length() + 7) / 8;
            byte[] bytes = new byte[n]; // new arrays are already zero-filled

            for (int i = nextSetBit(0); i >= 0; i = nextSetBit(i + 1)) {
                setBit(i, bytes);
            }

            return bytes;
        }

        // Mask for each bit position within a byte: bit 0 maps to 0x80.
        protected static final int[] BIT_MASK =
            {0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01};

        // Returns whether the given bit is set in the packed array (false if out of range).
        protected static boolean isBitOn(int bit, byte[] bytes) {
            int size = bytes == null ? 0 : bytes.length * 8;

            if (bit >= size)
                return false;

            return (bytes[bit / 8] & BIT_MASK[bit % 8]) != 0;
        }

        // Sets the given bit in the packed array.
        protected static void setBit(int bit, byte[] bytes) {
            int size = bytes == null ? 0 : bytes.length * 8;

            if (bit >= size)
                throw new ArrayIndexOutOfBoundsException("Byte array too small");

            bytes[bit / 8] |= BIT_MASK[bit % 8];
        }
    }
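One detail worth checking before sharing files across platforms is bit order. The class above places bit 0 at mask 0x80 of the first byte, while the built-in `BitSet.toByteArray()` (Java 7 and later) places bit 0 at 0x01. The sketch below illustrates the difference with a standalone helper; `BitOrderDemo` and `toMsbFirstBytes` are hypothetical names, not part of the class above.

```java
import java.util.BitSet;

public class BitOrderDemo {

    // Hypothetical helper: packs bits most-significant-bit first,
    // the same convention as the BIT_MASK table in ExportableBitSet.
    static byte[] toMsbFirstBytes(BitSet bits) {
        byte[] bytes = new byte[(bits.length() + 7) / 8];
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            bytes[i / 8] |= (byte) (0x80 >> (i % 8));
        }
        return bytes;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(0);

        // MSB-first: bit 0 is the top bit of the first byte.
        System.out.println(String.format("0x%02X", toMsbFirstBytes(bits)[0] & 0xFF)); // 0x80
        // Built-in toByteArray(): bit 0 is the lowest bit of the first byte.
        System.out.println(String.format("0x%02X", bits.toByteArray()[0] & 0xFF));    // 0x01
    }
}
```

Whichever convention is chosen, the reader on the other platform has to use the same one.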
