Setting extra bits in a bool makes it true and false at the same time

If I take a bool variable and set its second bit to 1, the variable evaluates to true and false at the same time. Compile the following code with GCC 6.3 using the -g option ( gcc-v6.3.0/Linux/RHEL6.0-2016-x86_64/bin/g++ -g main.cpp -o mytest_d ) and run the executable. You get the output shown below.

How can T be equal to true and false at the same time?

       value   bits 
       -----   ---- 
    T:   1     0001
after bit change
    T:   3     0011
T is true
T is false

This can happen when you call a function written in a different language (say Fortran) where the definition of true and false differs from C++. In Fortran, if any bit is nonzero the value is true, and if all bits are zero the value is false.

#include <iostream>
#include <bitset>

using namespace std;

void set_bits_to_1(void* val){
  char *x = static_cast<char *>(val);

  for (int i = 0; i<2; i++ ){
    *x |= (1UL << i);
  }
}

int main(int argc,char *argv[])
{

  bool T = 3;

  cout <<"       value   bits " <<endl;
  cout <<"       -----   ---- " <<endl;
  cout <<"    T:   "<< T <<"     "<< bitset<4>(T)<<endl;

  set_bits_to_1(&T);


  bitset<4> bit_T = bitset<4>(T);
  cout <<"after bit change"<<endl;
  cout <<"    T:   "<< T <<"     "<< bit_T<<endl;

  if (T ){
    cout <<"T is true" <<endl;
  }

  if ( T == false){
    cout <<"T is false" <<endl;
  }


}

///////////////////////////////////
// Fortran function that is not compatible with C++ when compiled with ifort.
///////////////////////////////////

       logical*1 function return_true()
         implicit none

         return_true = 1;

       end function return_true

In C++ the bit representation (and even the size) of a bool is implementation-defined; generally it's implemented as a char-sized type taking 1 or 0 as possible values.

If you set its value to anything other than the allowed ones (in this specific case by aliasing a bool through a char and modifying its bit representation), you are breaking the rules of the language, so anything can happen. In particular, the standard explicitly notes that a "broken" bool may behave as both true and false (or neither true nor false) at the same time:

Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false

(C++11, [basic.fundamental], note 47)


In this particular case, you can see how it ended up in this bizarre situation: the first if gets compiled to

    movzx   eax, BYTE PTR [rbp-33]
    test    al, al
    je      .L22

which loads T into eax (with zero extension) and skips the print if it's all zero; the next if instead is

    movzx   eax, BYTE PTR [rbp-33]
    xor     eax, 1
    test    al, al
    je      .L23

The test if(T == false) is transformed to if(T^1), which flips just the low bit. This would be fine for a valid bool, but for your "broken" one it doesn't cut it.
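To make the two branch conditions concrete, here is a small model of them that works on a plain byte instead of a real bool, so it stays within defined behaviour. The names first_if_taken and second_if_taken are made up for this illustration, and the model only mirrors the unoptimized sequences shown above, not what every compiler emits.

```cpp
#include <cstdint>

// Models the first branch: `test al, al` / `je` skips the body only when
// the whole byte is zero.
constexpr bool first_if_taken(std::uint8_t raw) {
    return raw != 0;
}

// Models the second branch: `xor eax, 1` flips only the low bit, then the
// same zero test decides whether the body runs.
constexpr bool second_if_taken(std::uint8_t raw) {
    return static_cast<std::uint8_t>(raw ^ 1u) != 0;
}

// Valid representations: exactly one branch is taken.
static_assert(first_if_taken(1) && !second_if_taken(1), "true takes only the first branch");
static_assert(!first_if_taken(0) && second_if_taken(0), "false takes only the second branch");

// The "broken" byte 0b0011 from the question: both branches are taken.
static_assert(first_if_taken(3) && second_if_taken(3), "3 takes both branches");
```

With a raw value of 3 both predicates hold, which matches the "T is true" and "T is false" lines in the question's output.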

Notice that this bizarre sequence is only generated at low optimization levels; at higher levels this is generally going to boil down to a zero/nonzero check, and a sequence like yours is likely to become a single test/conditional branch. You will get bizarre behavior anyway in other contexts, e.g. when adding bool values to other integers:

int foo(bool b, int i) {
    return i + b;
}

becomes

foo(bool, int):
        movzx   edi, dil
        lea     eax, [rdi+rsi]
        ret

where dil is "trusted" to be 0/1.


If your program is all C++, then the solution is simple: don't break bool values this way, avoid messing with their bit representation, and everything will go well. In particular, even if you assign an integer to a bool, the compiler will emit the necessary code to make sure that the resulting value is a valid bool, so your bool T = 3 is indeed safe, and T will end up holding a valid true.
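As a rough check of that claim, the sketch below converts an integer to bool and then copies the stored byte out for inspection. It assumes a one-byte bool (enforced with a static_assert), and the helper names stored_byte and from_int are invented for this example. Reading the representation with memcpy is fine; it is writing arbitrary bytes into a bool that causes trouble.

```cpp
#include <cstring>

// Assumption for this sketch: bool occupies exactly one byte.
static_assert(sizeof(bool) == 1, "sketch assumes a one-byte bool");

// Copies the object representation of a bool out without modifying it.
unsigned char stored_byte(bool b) {
    unsigned char raw;
    std::memcpy(&raw, &b, sizeof raw);
    return raw;
}

// Goes through the language's int -> bool conversion (nonzero becomes true),
// so the result always holds a valid bool value.
bool from_int(int value) {
    bool b = value;
    return b;
}
```

Comparing stored_byte(from_int(3)) with stored_byte(true) should show the same byte, since the conversion yields the ordinary true value rather than a copy of the integer's bits.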

If instead you need to interoperate with code written in other languages that may not share the same idea of what a bool is, just avoid bool in "boundary" code and marshal the value as an appropriately sized integer. It will work just as well in conditionals and the like.
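A minimal sketch of that boundary pattern might look like the following. Everything named here is hypothetical: foreign_flag_stub stands in for a foreign routine (in real code it would be an extern "C" declaration of the Fortran function), and the helpers assume the "nonzero means true" convention, which has to be checked against the other compiler's documentation.

```cpp
#include <cstdint>

// Hypothetical stand-in for a foreign routine whose "true" may be any
// nonzero byte. A real program would declare the external function instead.
std::int8_t foreign_flag_stub() {
    return 3;
}

// Normalize on the way in: the comparison produces a proper C++ bool.
// Assumption: the foreign side treats any nonzero value as true.
constexpr bool flag_to_bool(std::int8_t raw) {
    return raw != 0;
}

// Normalize on the way out: only ever hand 0 or 1 to the other side.
constexpr std::int8_t bool_to_flag(bool b) {
    return static_cast<std::int8_t>(b ? 1 : 0);
}

static_assert(flag_to_bool(3), "any nonzero byte maps to true");
static_assert(!flag_to_bool(0), "zero maps to false");
static_assert(bool_to_flag(true) == 1, "true is sent as 1");
```

The point of the design is that no bool object ever holds bytes written by foreign code; the integer crosses the boundary, and bool only appears after the comparison.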


Update about the Fortran/interoperability side of the issue

Disclaimer: all I know of Fortran is what I read this morning in standards documents, and that I have some punched cards with Fortran listings that I use as bookmarks, so go easy on me.

First of all, this kind of language interoperability isn't part of the language standards but of the platform ABI. As we are talking about Linux x86-64, the relevant document is the System V x86-64 ABI.

Nowhere is it specified that the C _Bool type (which is defined to be the same as C++ bool at 3.1.2, note †) has any kind of compatibility with Fortran LOGICAL; in particular, at 9.2.2, table 9.2 specifies that "plain" LOGICAL is mapped to signed int. About TYPE*N types it says that

The “TYPE*N” notation specifies that variables or aggregate members of type TYPE shall occupy N bytes of storage.

(ibid.)

There's no equivalent type explicitly specified for LOGICAL*1, and it's understandable: it's not even standard; indeed, if you try to compile a Fortran program containing a LOGICAL*1 in Fortran 95 compliant mode you get warnings about it, both by ifort

./example.f90(2): warning #6916: Fortran 95 does not allow this length specification.   [1]

    logical*1, intent(in) :: x

------------^

and by gfort

./example.f90:2:13:
     logical*1, intent(in) :: x
             1
Error: GNU Extension: Nonstandard type declaration LOGICAL*1 at (1)

so the waters are already muddied; so, combining the two rules above, I'd go for signed char to be safe.

However: the ABI also specifies:

The values for type LOGICAL are .TRUE. implemented as 1 and .FALSE. implemented as 0.

So, if you have a program that stores anything besides 1 and 0 in a LOGICAL value, you are already out of spec on the Fortran side! You say:

A fortran logical*1 has same representation as bool, but in fortran if bits are 00000011 it is true, in C++ it is undefined.

This last statement is not true: the Fortran standard is representation-agnostic, and the ABI explicitly says the contrary. Indeed you can see this in action easily by checking the output of gfort for LOGICAL comparison:

integer function logical_compare(x, y)
    logical, intent(in) :: x
    logical, intent(in) :: y
    if (x .eqv. y) then
        logical_compare = 12
    else
        logical_compare = 24
    end if
end function logical_compare

becomes

logical_compare_:
        mov     eax, DWORD PTR [rsi]
        mov     edx, 24
        cmp     DWORD PTR [rdi], eax
        mov     eax, 12
        cmovne  eax, edx
        ret

You'll notice that there's a straight cmp between the two values, without normalizing them first (unlike ifort, which is more conservative in this regard).
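The consequence of comparing without normalizing can be modelled in a few lines of C++. This is only an illustration: eqv_raw mirrors the straight cmp in the listing above, while eqv_normalized is a made-up example of a comparison that first reduces each operand to true/false under a "nonzero means true" assumption; it is not a transcription of what ifort emits.

```cpp
#include <cstdint>

// Mirrors the straight `cmp`: the stored words are compared as integers.
constexpr bool eqv_raw(std::int32_t x, std::int32_t y) {
    return x == y;
}

// Hypothetical normalizing comparison: each operand is reduced to
// true/false first (assumption: any nonzero value counts as true).
constexpr bool eqv_normalized(std::int32_t x, std::int32_t y) {
    return (x != 0) == (y != 0);
}

// In-spec values 0 and 1: both approaches agree.
static_assert(eqv_raw(1, 1) && eqv_normalized(1, 1), "1 and 1 are equivalent");
static_assert(!eqv_raw(1, 0) && !eqv_normalized(1, 0), "1 and 0 are not");

// Out-of-spec value 3: the raw comparison says "different" even though
// both operands would be considered true after normalization.
static_assert(!eqv_raw(1, 3) && eqv_normalized(1, 3), "raw compare misjudges 3");
```

So code that compares raw words is only correct if every LOGICAL it sees holds 0 or 1, which is exactly what the ABI text quoted above requires.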

Even more interesting: regardless of what the ABI says, ifort by default uses a nonstandard representation for LOGICAL; this is explained in the -fpscomp logicals switch documentation, which also specifies some interesting details about LOGICAL and cross-language compatibility:

Specifies that integers with a non-zero value are treated as true, integers with a zero value are treated as false. The literal constant .TRUE. has an integer value of 1, and the literal constant .FALSE. has an integer value of 0. This representation is used by Intel Fortran releases before Version 8.0 and by Fortran PowerStation.

The default is fpscomp nologicals, which specifies that odd integer values (low bit one) are treated as true and even integer values (low bit zero) are treated as false.

The literal constant .TRUE. has an integer value of -1, and the literal constant .FALSE. has an integer value of 0. This representation is used by Compaq Visual Fortran. The internal representation of LOGICAL values is not specified by the Fortran standard. Programs which use integer values in LOGICAL contexts, or which pass LOGICAL values to procedures written in other languages, are non-portable and may not execute correctly. Intel recommends that you avoid coding practices that depend on the internal representation of LOGICAL values.

(emphasis added)
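The two interpretation rules described in that documentation excerpt can be written down as predicates, which makes it easy to see where they disagree. This is a sketch in C++ rather than Fortran, the function names are invented, and it models only the rules as quoted, not the compiler's actual code generation.

```cpp
#include <cstdint>

// Rule quoted for "fpscomp logicals": any nonzero integer is true.
constexpr bool is_true_nonzero(std::int32_t raw) {
    return raw != 0;
}

// Rule quoted for the default "fpscomp nologicals": only the low bit counts.
// The cast to unsigned keeps the bit test well defined for negative values.
constexpr bool is_true_low_bit(std::int32_t raw) {
    return (static_cast<std::uint32_t>(raw) & 1u) != 0;
}

// The documented literal values agree under both rules.
static_assert(is_true_nonzero(1) && is_true_low_bit(1), "1 is true under both");
static_assert(is_true_nonzero(-1) && is_true_low_bit(-1), "-1 is true under both");
static_assert(!is_true_nonzero(0) && !is_true_low_bit(0), "0 is false under both");

// An even nonzero value is where they part ways.
static_assert(is_true_nonzero(2) && !is_true_low_bit(2), "2 is true only under the nonzero rule");
```

A value such as 2 is therefore true for one setting and false for the other, which is one more reason to pass only the documented literal values across a language boundary.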

Now, the internal representation of a LOGICAL normally shouldn't be a problem, as, from what I gather, if you play "by the rules" and don't cross language boundaries you aren't going to notice. For a standard-compliant program there's no "straight conversion" between INTEGER and LOGICAL; the only ways I see to shove an INTEGER into a LOGICAL seem to be TRANSFER, which is intrinsically non-portable and gives no real guarantees, or the non-standard INTEGER <-> LOGICAL conversion on assignment.

The latter is documented by gfort to always result in nonzero -> .TRUE., zero -> .FALSE., and you can see that in all cases code is generated to make this happen (even though it's convoluted code in the case of ifort with the legacy representation), so you cannot seem to shove an arbitrary integer into a LOGICAL in this way.

logical*1 function integer_to_logical(x)
    integer, intent(in) :: x
    integer_to_logical = x
    return
end function integer_to_logical

integer_to_logical_:
        mov     eax, DWORD PTR [rdi]
        test    eax, eax
        setne   al
        ret

The reverse conversion for a LOGICAL*1 is a straight integer zero-extension (gfort), so, to honor the contract in the documentation linked above, it's clearly expecting the LOGICAL value to be 0 or 1.

But in general, the situation for these conversions is a bit of a mess, so I'd just stay away from them.


So, long story short: avoid putting INTEGER data into LOGICAL values, as it is bad even in Fortran, and make sure to use the correct compiler flag to get the ABI-compliant representation for booleans; interoperability with C/C++ should then be fine. But to be extra safe, I'd just use plain char on the C++ side.

Finally, from what I gather from the documentation, ifort has some built-in support for interoperability with C, including booleans; you may try to leverage it.

This is what happens when you violate your contract with both the language and the compiler.

You probably heard somewhere that "zero is false" and "non-zero is true". That holds when you stick to the language's parameters, statically converting an int to bool or vice versa.

It does not hold when you start messing with bit representations. In that case, you break your contract, and enter the realm of (at the very least) implementation-defined behaviour.

Simply don't do that.

It's not up to you how a bool is stored in memory. It's up to the compiler. If you want to change a bool's value, either assign true/false, or assign an integer and use the proper conversion mechanisms provided by C++.


The C++ standard used to actually give a specific call-out to how using bool in this manner is naughty and bad and evil ("Using a bool value in ways described by this document as 'undefined', such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false."), though it was removed in C++20 for editorial reasons.
