
How can I find the size of an enum while ignoring the discriminant?

The Rust Reference documents that a Rust enum annotated with #[repr(C)] can be viewed as a C struct of two fields. The first field is a C enum for the discriminant, the second field is a C union of C structs corresponding to the fields of the enum's variants.

Due to a bug in an FFI interoperation library, I need to avoid using unions that are exactly 8 bytes. To that end, I wanted to add some static assertions to my Rust code so I would be aware of any problematic enums. I do not know how to ask the compiler for the size of the generated union type (or equivalently, the size of the enum without accounting for the discriminant):

#[repr(C)]
enum UnionSizeIs8Bytes {
    A(u8),
    B(u64),
}

#[repr(C)]
enum UnionSizeIsNot8Bytes {
    A(u8),
    B(u16),
}

const _: () = {
    // Should fail, but does not
    assert!(8 != std::mem::size_of::<UnionSizeIs8Bytes>());

    // Should not fail, but does
    assert!(8 != std::mem::size_of::<UnionSizeIsNot8Bytes>());
};

Reading the Rust Reference on repr(C) field-less enums:

[...] the C representation has the size and alignment of the default enum size and alignment for the target platform's C ABI.

That is, they try to be fully compatible with C enums.

And in the next section, about enums with fields:

[..] is a repr(C) struct with two fields:

  • a repr(C) version of the enum with all fields removed ("the tag")
  • a repr(C) union of repr(C) structs for the fields of each variant that had them ("the payload")

That is, your enum:

#[repr(C)]
enum UnionSizeIs8Bytes {
    A(u8),
    B(u64),
}

has the same layout as this explicit struct:

#[repr(C)]
enum UnionSizeIs8Bytes_Tag {
    A,
    B,
}
#[repr(C)]
union UnionSizeIs8Bytes_Union {
    a: u8,
    b: u64,
}
#[repr(C)]
struct UnionSizeIs8Bytes_Explicit {
    tag: UnionSizeIs8Bytes_Tag,
    data: UnionSizeIs8Bytes_Union,
}
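As a quick sanity check of that equivalence, here is a minimal sketch that asks the compiler to compare both definitions at compile time. The names Tag, Payload and Explicit are only illustrative (shortened from the ones above), and the check covers size and alignment only, not field offsets:

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
enum UnionSizeIs8Bytes {
    A(u8),
    B(u64),
}

// Hand-written equivalent of the enum above (illustrative names).
#[allow(dead_code)]
#[repr(C)]
enum Tag {
    A,
    B,
}

#[allow(dead_code)]
#[repr(C)]
union Payload {
    a: u8,
    b: u64,
}

#[allow(dead_code)]
#[repr(C)]
struct Explicit {
    tag: Tag,
    data: Payload,
}

// Compilation stops here if the two layouts ever disagree in size or alignment.
const _: () = {
    assert!(size_of::<UnionSizeIs8Bytes>() == size_of::<Explicit>());
    assert!(align_of::<UnionSizeIs8Bytes>() == align_of::<Explicit>());
};
```

If this compiles, the enum and the explicit struct agree on both values for the current target.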

Now, what is the actual size and alignment of an enum in C? It seems that even experts do not fully agree on the details. In practice, most mainstream C compilers use a plain int as the underlying type of an enum, which corresponds to a 4-byte i32 or u32 in Rust on common targets.

With that in mind, the layout of your examples should be straightforward:

  • UnionSizeIs8Bytes:

    • 0-4: tag
    • 4-8: padding
    • 8-16: union
      • 8-9: u8
      • 8-16: u64
    • Size: 16, alignment: 8
  • UnionSizeIsNot8Bytes:

    • 0-4: tag
    • 4-6: union
      • 4-5: u8
      • 4-6: u16
    • 6-8: padding
    • Size: 8, alignment: 4

Note that the alignment of a repr(C) enum is never less than that of the tag, which is 4 bytes under the above assumptions.
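The numbers in the layouts above can be checked with a small sketch like the following. It assumes a typical 64-bit target (4-byte C int, 8-byte alignment for u64) and a toolchain that has std::mem::offset_of! (Rust 1.77 or later). The helper layout, the hand-written Tag, Payload64, Payload16, Explicit64 and Explicit16 types, and the two constants are illustrative names, introduced only because offset_of! needs a named field to point at:

```rust
use std::mem::{align_of, offset_of, size_of};

#[allow(dead_code)]
#[repr(C)]
enum UnionSizeIs8Bytes {
    A(u8),
    B(u64),
}

#[allow(dead_code)]
#[repr(C)]
enum UnionSizeIsNot8Bytes {
    A(u8),
    B(u16),
}

// Hand-written equivalents of the two enums (illustrative names).
#[allow(dead_code)]
#[repr(C)]
enum Tag {
    A,
    B,
}

#[allow(dead_code)]
#[repr(C)]
union Payload64 {
    a: u8,
    b: u64,
}

#[allow(dead_code)]
#[repr(C)]
union Payload16 {
    a: u8,
    b: u16,
}

#[allow(dead_code)]
#[repr(C)]
struct Explicit64 {
    tag: Tag,
    data: Payload64,
}

#[allow(dead_code)]
#[repr(C)]
struct Explicit16 {
    tag: Tag,
    data: Payload16,
}

/// Returns (size, alignment) of `T` in bytes.
pub const fn layout<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

/// Byte offset at which the payload union starts in each explicit struct.
pub const DATA_OFFSET_64: usize = offset_of!(Explicit64, data);
pub const DATA_OFFSET_16: usize = offset_of!(Explicit16, data);
```

On x86_64 Linux, layout::<UnionSizeIs8Bytes>() should come out as (16, 8) with the payload at offset 8, and layout::<UnionSizeIsNot8Bytes>() as (8, 4) with the payload at offset 4. Other targets may differ, for instance where u64 is only 4-byte aligned.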

To compute the size of the data without the tag, you just have to subtract the alignment from the full size. The alignment value accounts for the size of the tag itself plus any padding needed before the union.

const fn size_of_enum_data<T>() -> usize {
    std::mem::size_of::<T>() - std::mem::align_of::<T>()
}
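One caveat worth keeping in mind: the subtraction removes the tag and the padding in front of the union, but not any padding after it. So when the union's alignment is smaller than the tag's, the result can be larger than the union itself. The sketch below illustrates this with a hypothetical enum, SmallPayload, and a hand-written union, SmallPayloadUnion, assuming a 4-byte tag:

```rust
use std::mem::{align_of, size_of};

pub const fn size_of_enum_data<T>() -> usize {
    size_of::<T>() - align_of::<T>()
}

// Hypothetical enum whose largest variant is 5 bytes with alignment 1.
#[allow(dead_code)]
#[repr(C)]
enum SmallPayload {
    A(u8),
    B([u8; 5]),
}

// Hand-written union with the same fields, to get the exact payload size.
#[allow(dead_code)]
#[repr(C)]
union SmallPayloadUnion {
    a: u8,
    b: [u8; 5],
}

// With a 4-byte tag: 4 (tag) + 5 (union) = 9, rounded up to 12 for alignment.
pub const ESTIMATED: usize = size_of_enum_data::<SmallPayload>();
pub const EXACT: usize = size_of::<SmallPayloadUnion>();
```

Here ESTIMATED should be 8 while EXACT is 5, so an assertion against 8 would flag this enum even though its union is not 8 bytes. The estimate never comes out smaller than the real union, so for the purpose of catching problematic enums it errs on the cautious side.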

If you want to be extra sure, you could subtract std::mem::align_of::<T>().max(std::mem::size_of::<i32>()) instead, in case the alignment of i32 on your architecture is not 4. Unfortunately, max doesn't seem to be const yet. You could write an if, of course, but that gets ugly, something like:

const fn size_of_enum_data<T>() -> usize {
    let a = std::mem::align_of::<T>();
    let i = std::mem::size_of::<i32>();
    std::mem::size_of::<T>() - if a > i { a } else { i }
}

And if you want to be extra, extra sure, you can use c_int instead of i32. But then again, on esoteric architectures where c_int != i32, the assumption that a C enum has the size of a C int may not hold either...
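For reference, the c_int variant could look roughly like this. It is only a sketch: the name size_of_enum_data_c_int is made up for the example, std::ffi::c_int needs Rust 1.64 or later, and it still assumes that the tag of a repr(C) enum has the size of a C int:

```rust
use std::ffi::c_int;
use std::mem::{align_of, size_of};

/// Size of the enum minus the tag and the padding in front of the union,
/// assuming the tag occupies `size_of::<c_int>()` bytes.
pub const fn size_of_enum_data_c_int<T>() -> usize {
    let a = align_of::<T>();
    let i = size_of::<c_int>();
    // Manual `max`, since `Ord::max` cannot be called in a const fn.
    size_of::<T>() - if a > i { a } else { i }
}

#[allow(dead_code)]
#[repr(C)]
enum UnionSizeIs8Bytes {
    A(u8),
    B(u64),
}

#[allow(dead_code)]
#[repr(C)]
enum UnionSizeIsNot8Bytes {
    A(u8),
    B(u16),
}
```

On a typical 64-bit target this should give 8 for UnionSizeIs8Bytes and 4 for UnionSizeIsNot8Bytes, the same as the i32 version.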

Then your assertions would be (playground):

const _: () = {
    // It fails
    assert!(8 != size_of_enum_data::<UnionSizeIs8Bytes>());

    // It does not fail
    assert!(8 != size_of_enum_data::<UnionSizeIsNot8Bytes>());
};
