
Pointer to data member address

I have read (in Inside the C++ Object Model) that the value of a pointer to data member in C++ is the offset of the data member plus 1.
I am trying this on VC++ 2005, but I am not getting those offset values.
For example:

class X {
  public:
    int a;
    int b;
    int c;
};

void x() {
  printf("Offsets of a=%d, b=%d, c=%d", &X::a, &X::b, &X::c);
}

This should print Offsets of a=1, b=5, c=9, but in VC++ 2005 it comes out as a=0, b=4, c=8.
I am not able to understand this behavior.
Excerpt from the book:

"That expectation, however, is off by one - a somewhat traditional error for both C and C++ programmers.

The physical offset of the three coordinate members within the class layout are, respectively, either 0, 4, and 8 if the vptr is placed at the end or 4, 8, and 12 if the vptr is placed at the start of the class. The value returned from taking the member's address, however, is always bumped up by 1. Thus the actual values are 1, 5, and 9, and so on. The problem is distinguishing between a pointer to no data member and a pointer to the first data member. Consider for example:

float Point3d::*p1 = 0;
float Point3d::*p2 = &Point3d::x;

// oops: how to distinguish?
if ( p1 == p2 ) {
   cout << " p1 & p2 contain the same value - ";
   cout << " they must address the same member!" << endl;
}

To distinguish between p1 and p2, each actual member offset value is bumped up by 1. Hence, both the compiler (and the user) must remember to subtract 1 before actually using the value to address a member."

The offset of something is how many units it is from the start. The first thing is at the start, so its offset is zero.

Think in terms of your structure being at memory location 100:

100: class X { int a;
104:           int b;
108:           int c;

As you can see, the address of a is the same as the address of the entire structure, so its offset (what you have to add to the structure address to get the item address) is 0.

Note that the ISO standard doesn't specify where the items are laid out in memory. Padding bytes to create correct alignment are certainly possible. In a hypothetical environment where ints were only two bytes but their required alignment was 256 bytes, they wouldn't be at 0, 2 and 4 but rather at 0, 256 and 512.


And, if that book you're taking the excerpt from is really Inside the C++ Object Model, it's getting a little long in the tooth.

The fact that it's from '96 and discusses the internals underneath C++ (waxing lyrical about how good it is to know where the vptr is, missing the whole point that that's working at the wrong abstraction level and you should never care) dates it quite a bit. In fact, the introduction even states "Explains the basic implementation of the object-oriented features ..." (my italics).

And the fact that nobody can find anything in the ISO standard saying this behaviour is required, along with the fact that neither MSVC nor gcc acts that way, leads me to believe that, even if this was true of one particular implementation far in the past, it's not true (or required to be true) of all.

The author apparently led the cfront 2.1 and 3 teams and, while this book seems of historical interest, I don't think it's relevant to the modern C++ language (and implementation), at least those bits I've read.

Firstly, the internal representation of values of a pointer to data member type is an implementation detail. It can be done in many different ways. You came across a description of one possible implementation, where the pointer contains the offset of the member plus 1. It is rather obvious where that "plus 1" comes from: that specific implementation wants to reserve the physical zero value (0x0) for the null pointer, so the offset of the first data member (which could easily be 0) has to be transformed to something else to make it different from a null pointer. Adding 1 to all such pointers solves the problem.

However, it should be noted that this is a rather cumbersome approach (i.e. the compiler always has to subtract 1 from the physical value before performing access). That implementation was apparently trying very hard to make sure that all null pointers are represented by a physical zero-bit pattern. To tell the truth, I haven't encountered implementations that follow this approach in practice these days.

Today, most popular implementations (like GCC or MSVC++) use just the plain offset (not adding anything to it) as the internal representation of the pointer to a data member. The physical zero will, of course, no longer work for representing null pointers, so they use some other physical value to represent null pointers, like 0xFFFF... (this is what GCC and MSVC++ use).

Secondly, I don't understand what you were trying to say with your p1 and p2 example. You are absolutely wrong to assume that the pointers will contain the same value. They won't.

If we follow the approach described in your post ("offset + 1"), then p1 will receive the physical value of the null pointer (apparently a physical 0x0), while p2 will receive the physical value 0x1 (assuming x has offset 0). 0x0 and 0x1 are two different values.

If we follow the approach used by modern GCC and MSVC++ compilers, then p1 will receive the physical value 0xFFFF... (null pointer), while p2 will be assigned a physical 0x0. 0xFFFF... and 0x0 are again different values.

P.S. I just realized that the p1 and p2 example is actually not yours, but a quote from a book. Well, the book, once again, is describing the same problem I mentioned above - the conflict of 0 offset with 0x0 representation for null pointer - and offers one possible viable approach to solving that conflict. But, once again, there are alternative ways to do it, and many compilers today use completely different approaches.

The behavior you're getting looks quite reasonable to me. What sounds wrong is what you read.

To complement AndreyT's answer: try running this code on your compiler.

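#include <cstddef>   // NULL
#include <iostream>

// Assumes class X from the question is in scope.
void test()
{
    using namespace std;

    // Streaming pm converts it to bool, so "value" prints 0 for null and 1 otherwise.
    // The cast reads the first sizeof(unsigned int) bytes of pm's representation.
    int X::* pm = NULL;
    cout << "NULL pointer to member: "
        << " value = " << pm
        << ", raw byte value = 0x" << hex << *(unsigned int*)&pm << endl;

    pm = &X::a;
    cout << "pointer to member a: "
        << " value = " << pm
        << ", raw byte value = 0x" << hex << *(unsigned int*)&pm << endl;

    pm = &X::b;
    cout << "pointer to member b: "
        << " value = " << pm
        << ", raw byte value = 0x" << hex << *(unsigned int*)&pm << endl;
}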

On Visual Studio 2008 I get:

NULL pointer to member:  value = 0, raw byte value = 0xffffffff
pointer to member a:  value = 1, raw byte value = 0x0
pointer to member b:  value = 1, raw byte value = 0x4

So indeed, this particular compiler is using a special bit pattern to represent a NULL pointer, thus leaving the 0x0 bit pattern to represent a pointer to the first member of an object.

This also means that wherever the compiler generates code to translate such a pointer to an integer or a boolean, it must take care to look for that special bit pattern. Thus something like if(pm) or the conversion performed by the << stream operator is actually written by the compiler as a test against the 0xffffffff bit pattern (instead of how we typically like to think of pointer tests, as a raw test against address 0x0).

"I have read that address of pointer to data member in C++ is the offset of data member plus 1?"

I have never heard that, and your own empirical evidence shows it's not the case. I think you misunderstood an odd property of structs and classes in C++: if they are completely empty, they nevertheless have a size of 1 (so that each element of an array of them has a unique address).

$9.2/12 of the Standard is interesting:

"Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1)."

This explains that such behavior is implementation-defined. However, the fact that 'a', 'b' and 'c' are at increasing addresses is in accordance with the Standard.

You may want to check out "How are objects stored in memory in C++?", which talks about this issue in much more detail.
