Setting extra bits in a bool makes it true and false at the same time
If I get a bool variable and set its second bit to 1, the variable evaluates to true and false at the same time. Compile the following code with gcc 6.3 with the -g option (gcc-v6.3.0/Linux/RHEL6.0-2016-x86_64/bin/g++ -g main.cpp -o mytest_d) and run the executable. You get the following output.
How can T be equal to true and false at the same time?
value bits
----- ----
T: 1 0001
after bit change
T: 3 0011
T is true
T is false
This can happen when you call a function written in a different language (say Fortran) where the definition of true and false differs from that of C++. In Fortran, if any bit is nonzero the value is true; if all bits are zero the value is false.
#include <iostream>
#include <bitset>
using namespace std;
// Sets the two lowest bits of the first byte that val points to.
void set_bits_to_1(void* val){
    char *x = static_cast<char *>(val);
    for (int i = 0; i<2; i++ ){
        *x |= (1UL << i);
    }
}
int main(int argc,char *argv[])
{
    bool T = 3;
    cout <<" value bits " <<endl;
    cout <<" ----- ---- " <<endl;
    cout <<" T: "<< T <<" "<< bitset<4>(T)<<endl;
    // Modify the object representation of T behind the compiler's back.
    set_bits_to_1(&T);
    bitset<4> bit_T = bitset<4>(T);
    cout <<"after bit change"<<endl;
    cout <<" T: "<< T <<" "<< bit_T<<endl;
    if (T ){
        cout <<"T is true" <<endl;
    }
    if ( T == false){
        cout <<"T is false" <<endl;
    }
}
///////////////////////////////////
// Fortran function that is not compatible with C++ when compiled with ifort.
///////////////////////////////////
logical*1 function return_true()
implicit none
return_true = 1;
end function return_true
In C++ the bit representation (and even the size) of a bool is implementation-defined; generally it's implemented as a char-sized type taking 1 or 0 as possible values.
If you set its value to anything different from the allowed ones (in this specific case by aliasing a bool through a char and modifying its bit representation), you are breaking the rules of the language, so anything can happen. In particular, it's explicitly specified in the standard that a "broken" bool may behave as both true and false (or neither true nor false) at the same time:
Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false.
(C++11, [basic.fundamental], note 47)
In this particular case, you can see how it ended up in this bizarre situation: the first if gets compiled to
movzx eax, BYTE PTR [rbp-33]
test al, al
je .L22
which loads T into eax (with zero extension), and skips the print if it's all zero; the next if instead is
movzx eax, BYTE PTR [rbp-33]
xor eax, 1
test al, al
je .L23
The test if(T == false) is transformed to if(T^1), which flips just the low bit. This would be OK for a valid bool, but for your "broken" one it doesn't cut it: with the byte holding 3, 3 ^ 1 is 2, which is still nonzero, so the second branch is taken as well.
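The two compiled tests can be mimicked on a plain byte, without touching a real bool (which would be undefined behaviour again). This is only a sketch: the function names are made up for illustration, and the logic mirrors the unoptimized GCC output quoted above, not anything the language guarantees.

```cpp
#include <cassert>

// Mirrors `test al, al`: the first `if (T)` is taken when the byte is nonzero.
bool first_if_taken(unsigned char raw) {
    return raw != 0;
}

// Mirrors `xor eax, 1` followed by `test al, al`: the `if (T == false)`
// branch is taken when flipping the low bit leaves a nonzero byte.
bool second_if_taken(unsigned char raw) {
    return static_cast<unsigned char>(raw ^ 1u) != 0;
}
```

For the valid representations 0 and 1 exactly one of the two functions returns true; for the byte value 3 both do, which matches the "T is true" / "T is false" output in the question.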
Notice that this bizarre sequence is only generated at low optimization levels; at higher levels this is generally going to boil down to a zero/nonzero check, and a sequence like yours is likely to become a single test/conditional branch. You will get bizarre behavior anyway in other contexts, e.g. when summing bool values to other integers:
int foo(bool b, int i) {
return i + b;
}
foo(bool, int):
movzx edi, dil
lea eax, [rdi+rsi]
ret
where dil (the low byte of the first argument register, holding b) is "trusted" to be 0/1.
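To make the consequence concrete, here is a small sketch that models the generated code on a raw byte instead of a bool. The helper names are hypothetical; the first function does what the assembly above does (adds the byte as-is), the second does what the source-level semantics promise for a valid bool.

```cpp
#include <cassert>

// Mirrors `movzx edi, dil` + `lea eax, [rdi+rsi]`: the byte is added as-is.
int add_trusting_byte(unsigned char raw, int i) {
    return i + raw;
}

// What `i + b` means for a valid bool: it adds either 0 or 1.
int add_normalized(unsigned char raw, int i) {
    return i + (raw != 0 ? 1 : 0);
}
```

With a byte of 1 both agree; with a byte of 3 the first one returns i + 3, which is the kind of surprise a "broken" bool can produce.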
If your program is all C++, then the solution is simple: don't break bool values this way, avoid messing with their bit representation and everything will go well; in particular, even if you assign from an integer to a bool the compiler will emit the necessary code to make sure that the resulting value is a valid bool, so your bool T = 3 is indeed safe, and T will end up with a true in its guts.
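A minimal sketch of that safe path, using only value conversions (the function names are mine, for illustration). Nothing here looks at the bytes of the bool, so the result is well defined regardless of how the implementation stores it.

```cpp
#include <cassert>

// Integer-to-bool is a value conversion: zero becomes false,
// any other value becomes true.
bool to_bool(int value) {
    return value != 0;   // same result as static_cast<bool>(value)
}

// Converting a bool obtained this way back to int always yields 0 or 1.
int round_trip(int value) {
    return static_cast<int>(static_cast<bool>(value));
}
```

So round_trip(3) is 1, not 3: the conversion normalizes the value, which is exactly what the pointer trick in the question bypasses.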
If instead you need to interoperate with code written in other languages that may not share the same idea of what a bool is, just avoid bool for "boundary" code, and marshal it as an appropriately-sized integer. It will work in conditionals & co. just as fine.
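As a rough sketch of such boundary code: the foreign routine below is a stub I made up so the example is self-contained (a real project would declare the actual Fortran symbol there), and the conversion assumes the foreign side uses the "nonzero means true" convention, which you would have to check for your compiler and flags.

```cpp
#include <cassert>

// Stand-in for a foreign routine that reports a truth value in one byte.
// Hypothetical: replace with the declaration of the real external function.
extern "C" signed char foreign_flag_stub(int which) {
    return which == 0 ? 0 : 3;   // 3 models a "true" that is not the byte 1
}

// Boundary wrapper: receive the byte as an integer type, never as bool,
// then convert by value (assumed convention: nonzero means true).
bool flag_from_foreign(int which) {
    signed char raw = foreign_flag_stub(which);
    return raw != 0;
}
```

The rest of the C++ program only ever sees the bool returned by flag_from_foreign, which is a valid one whatever byte the other side produced.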
Disclaimer: all I know of Fortran is what I read this morning on standard documents, and that I have some punched cards with Fortran listings that I use as bookmarks, so go easy on me.
First of all, this kind of language interoperability stuff isn't part of the language standards, but of the platform ABI. As we are talking about Linux x86-64, the relevant document is the System V x86-64 ABI.
Nowhere is it specified that the C _Bool type (which is defined to be the same as C++ bool at 3.1.2 note †) has any kind of compatibility with Fortran LOGICAL; in particular, at 9.2.2, table 9.2 specifies that "plain" LOGICAL is mapped to signed int. About TYPE*N types it says that
The “TYPE*N” notation specifies that variables or aggregate members of type TYPE shall occupy N bytes of storage.
(ibid.)
There's no equivalent type explicitly specified for LOGICAL*1, and it's understandable: it's not even standard; indeed if you try to compile a Fortran program containing a LOGICAL*1 in Fortran 95 compliant mode you get warnings about it, both by ifort
./example.f90(2): warning #6916: Fortran 95 does not allow this length specification. [1]
logical*1, intent(in) :: x
------------^
and by gfort
./example.f90:2:13:
logical*1, intent(in) :: x
1
Error: GNU Extension: Nonstandard type declaration LOGICAL*1 at (1)
so the waters are already muddled; so, combining the two rules above, I'd go for signed char to be safe.
However: the ABI also specifies:
The values for type LOGICAL are .TRUE. implemented as 1 and .FALSE. implemented as 0.
So, if you have a program that stores anything besides 1 and 0 in a LOGICAL value, you are already out of spec on the Fortran side! You say:
A fortran logical*1 has same representation as bool, but in fortran if bits are 00000011 it is true, in C++ it is undefined.
This last statement is not true: the Fortran standard is representation-agnostic, and the ABI explicitly says the contrary. Indeed you can see this in action easily by checking the output of gfort for LOGICAL comparison:
integer function logical_compare(x, y)
logical, intent(in) :: x
logical, intent(in) :: y
if (x .eqv. y) then
logical_compare = 12
else
logical_compare = 24
end if
end function logical_compare
becomes
logical_compare_:
mov eax, DWORD PTR [rsi]
mov edx, 24
cmp DWORD PTR [rdi], eax
mov eax, 12
cmovne eax, edx
ret
You'll notice that there's a straight cmp between the two values, without normalizing them first (unlike ifort, which is more conservative in this regard).
Even more interesting: regardless of what the ABI says, ifort by default uses a nonstandard representation for LOGICAL; this is explained in the -fpscomp logicals switch documentation, which also specifies some interesting details about LOGICAL and cross-language compatibility:
Specifies that integers with a non-zero value are treated as true, integers with a zero value are treated as false. The literal constant .TRUE. has an integer value of 1, and the literal constant .FALSE. has an integer value of 0. This representation is used by Intel Fortran releases before Version 8.0 and by Fortran PowerStation.
The default is fpscomp nologicals, which specifies that odd integer values (low bit one) are treated as true and even integer values (low bit zero) are treated as false. The literal constant .TRUE. has an integer value of -1, and the literal constant .FALSE. has an integer value of 0. This representation is used by Compaq Visual Fortran. The internal representation of LOGICAL values is not specified by the Fortran standard. Programs which use integer values in LOGICAL contexts, or which pass LOGICAL values to procedures written in other languages, are non-portable and may not execute correctly. Intel recommends that you avoid coding practices that depend on the internal representation of LOGICAL values.
(emphasis added)
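The difference between the two conventions described in that quote is easy to see if you write them down as predicates over the stored integer. The sketch below is C++ and purely illustrative (the function names are invented, and it models only the rules as quoted, not what any compiler actually emits):

```cpp
#include <cassert>

// "fpscomp logicals" rule as quoted: nonzero is true, zero is false.
bool is_true_nonzero_rule(int raw) {
    return raw != 0;
}

// Default "fpscomp nologicals" rule as quoted: odd values (low bit one)
// are true, even values (low bit zero) are false.
bool is_true_low_bit_rule(int raw) {
    return (static_cast<unsigned>(raw) & 1u) != 0;
}
```

The two rules agree on 0, 1 and -1, but a stored value of 2 is true under the first and false under the second, so the same bytes can mean different things depending on a compiler switch.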
Now, the internal representation of a LOGICAL normally shouldn't be a problem, as, from what I gather, if you play "by the rules" and don't cross language boundaries you aren't going to notice. For a standard-compliant program there's no "straight conversion" between INTEGER and LOGICAL; the only ways I see to shove an INTEGER into a LOGICAL seem to be TRANSFER, which is intrinsically non-portable and gives no real guarantees, or the non-standard INTEGER <-> LOGICAL conversion on assignment.
The latter one is documented by gfort to always result in nonzero -> .TRUE., zero -> .FALSE., and you can see that in all cases code is generated to make this happen (even though it's convoluted code in case of ifort with the legacy representation), so you cannot seem to shove an arbitrary integer into a LOGICAL in this way.
logical*1 function integer_to_logical(x)
integer, intent(in) :: x
integer_to_logical = x
return
end function integer_to_logical
integer_to_logical_:
mov eax, DWORD PTR [rdi]
test eax, eax
setne al
ret
The reverse conversion for a LOGICAL*1 is a straight integer zero-extension (gfort), so, to honor the contract in the documentation linked above, it's clearly expecting the LOGICAL value to be 0 or 1.
But in general, the situation for these conversions is a bit of a mess, so I'd just stay away from them.
So, long story short: avoid putting INTEGER data into LOGICAL values, as it is bad even in Fortran, and make sure to use the correct compiler flag to get the ABI-compliant representation for booleans, and interoperability with C/C++ should be fine. But to be extra safe, I'd just use plain char on the C++ side.
Finally, from what I gather from the documentation, in ifort there is some builtin support for interoperability with C, including booleans; you may try to leverage it.
This is what happens when you violate your contract with both the language and the compiler.
You probably heard somewhere that "zero is false", and "non-zero is true". That holds when you stick to the language's parameters, statically converting an int to bool or vice versa.
It does not hold when you start messing with bit representations. In that case, you break your contract, and enter the realm of (at the very least) implementation-defined behaviour.
Simply don't do that.
It's not up to you how a bool is stored in memory. It's up to the compiler. If you want to change a bool's value, either assign true/false, or assign an integer and use the proper conversion mechanisms provided by C++.
The C++ standard used to actually give a specific call-out to how using bool in this manner is naughty and bad and evil ("Using a bool value in ways described by this document as 'undefined', such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false."), though it was removed in C++20 for editorial reasons.