How does the [] operator work?

I'm working with C, but I think this is a lower-level question that isn't language specific.

How does the program grab the right data with array[0] or array[6], regardless of what type of data the array holds? Does it store the length internally, or is there some sort of delimiter it looks for?

The compiler knows the size (sizeof) of the underlying data type and adds the right byte offset to the pointer.

a[10] is equivalent to *(a + 10), which is equivalent to *(10 + a), which in turn is equivalent to 10[a]. No kidding.

The compiler works out the size at compile time and hard-codes it into the object code.

I would like to contribute something other than a direct answer.

There is an interesting article on Dennis Ritchie's homepage on the history of C which has quite a bit to say about arrays, array indices, etc.

This will probably not directly answer your question, but it may further your understanding of C arrays... and it is an interesting read.

Neither :-)

For an array, the compiler knows: (a) the address of the start of the array, and (b) what type of elements (int, float, double, etc.) the array holds, and hence how long each element is.

With those two pieces of information, finding array[6] is a simple matter of arithmetic: start with the base address, and add 6 times the size of an element.

The compiler substitutes the size of the data type, which is fixed at compile time.

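#include <stdio.h>
#include <stdlib.h>
int getInt(void *memory, size_t offset)
{
    /* Arithmetic on void * is not standard C, so step in bytes through char *. */
    return *(int *)((char *)memory + sizeof(int) * offset);
}

int main(void)
{
    void *chunkOfMemory = calloc(0x1000, 1); /* zero-filled, so the reads below are defined */
    if (chunkOfMemory == NULL) return 1;
    int *intarray = (int *)chunkOfMemory;
    printf("%d is equal to %d\n", getInt(chunkOfMemory, 9), intarray[9]);
    free(chunkOfMemory);
}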

The compiler knows the size of each element of the array at compile time. For instance:

int64_t array[5];
...
int64_t a = array[3];

This will be converted to the pseudo-assembly code:

addr <- array
addr <- addr + 3 * sizeof(int64_t)
//                 ^^^^^^^^^^^^^^^ which the compiler knows is 8
//             ^^^^^^^^^^^^^^^^^^^ which the compiler can replace with 24.
a <- *addr 

The length of the array doesn't matter.

It's compiler magic!

The compiler knows the size of the array elements and uses it to calculate the right address.

Yes, you are right, it is an even lower-level question; even assembly language has a form of the [] operator. This answer said it quite well, but my explanation would be:

arr[x] is the same as *(T *)((char *)arr + x * sizeof(arr[0])), where T is the element type of arr.

It looks a bit complicated, but the generated code is simple. The compiler knows sizeof(arr[0]) and hard-codes it into the compiled code. The casts are there only to satisfy the language's type rules, which protect the programmer from dumb mistakes; in the compiled code there are no type conversions.

One more thing: having mentioned lower-level languages, I should mention higher-level ones too. In languages that support it (C++, for example) you can overload the [] operator and make it do whatever you want.

No, it doesn't. It just gets or sets the element at address array + X*sizeof(TypeOfArrayEl), so you can easily go out of bounds and nothing may report an error at that point. That's why array[6] is the same as 6[array].

Assume array is of type int:

int array[12];

The [] operator adds the value in the brackets (times the size in bytes of the element type) to the address outside the brackets. In most expressions, the name of an array is converted to a pointer to its first element. So the array declaration above reserves 12 * sizeof(int) bytes, and array refers to the start of that storage. This leads to wonky stuff like 3[array] giving you the element at index 3 (the fourth element) of the array.

Anyway, the answer to your question is that the compiler looks at the type of the array at compile time and multiplies the thing in the [] by the size of the type held by the array.

From what I remember, C doesn't give you a compile-time error if the index is out of bounds. If you go beyond the bounds, the pointer arithmetic simply lands on the adjacent memory location, although the language leaves the result of such an access undefined. The only thing C takes care of is how many bytes to advance the pointer by. For an int array the pointer advances by sizeof(int) bytes for every increment of the index (2 on old 16-bit compilers, usually 4 today), and for a char array it advances by 1 byte.

You can always try to access locations that are out of bounds, but what you get is junk data, and you as the programmer have to ensure that you're accessing the right data.

That is the price of freedom I guess :)
