
Is a C-Style array to std::array transition completely safe for arrays?

First time questioner :) Is it possible to transform global C-style arrays into std::arrays without breaking the code? I'm working on a project that consists of decompiling the source code of an old game. We have already managed to refactor a large part of the disassembly/decompilation output. Since the output is automatic, there are still sections like

  int a;
  int b[50];
  *(&a + 100) = xxx;

or

  int b[50];
  int a;
  *(&a - 100) = xxx;

and other types of crazy pointer arithmetic remaining, which have yet to be refactored manually. But we would like to use bounds checking for the sections that have been (presumably) correctly changed to arrays.

(Ignore the text in italics; I'm keeping it just for consistency in the comments.) I've found one problem so far with changing every array: sizeof(class containing array) would change. This could break code in some loops, for example:

  someclass somearray[100];    // assume (sizeof(somearray[0]) == 50) is true
  int pointer = (int)somearray;
  pointer += 100;
  ((someclass*)pointer)->doSomething();

because after pointer += 100, pointer wouldn't be pointing at the second element but somewhere inside the first, or even zeroth, I'm not sure (don't forget it's automatically decompiled code, hence the ugliness).

I'm thinking of changing every global array to std::array and every instance of accessing the array without the [] operator to array._Elems (an MSVC-internal member; the portable equivalent is array.data()).

Are there any problems that might arise if I were to change global arrays to std::arrays in code such as this?

Edit You were right about the size not changing. I had an error in the testing functions. So I'll expand the question:

Is it safe to change every C-style array to std::array?

Edit Our current code is actually only runnable in debug mode, since it doesn't move variables around. Release mode crashes basically at the start of the program.

Edit Since there seems to be some confusion about what this question is about, let me clarify: is there some guarantee that there's no member in the array other than T elems[N]? Can I count on having

  array<array<int, 10>, 10> varname;
  int* ptr = &varname[0][0];
  ptr += 10;

and be sure that ptr is pointing at varname[1][0] regardless of implementation details? Although it's guaranteed that an array is contiguous, I'm not sure about this. The standard contains an implementation, but I'm not sure whether that's an example implementation or the actual definition that every implementation should adhere to, with iterator and const_iterator being the only implementation-specific parts, since only those are marked implementation-defined (I don't have the latest specification at hand, so there might be some other differences).

For one-dimensional arrays, this might work in all cases; the 2D case is trickier:

In principle, it is possible for the std::array<> template to consist of nothing but the array itself, because its length argument is a compile-time constant that does not need to be stored. However, your standard-library implementation might have chosen to store it anyway, or any other data it needs. So, while &a[n] == &a[0] + n holds for any std::array, the expression &a[n][0] == &a[0][0] + n*arrayWidth might not hold for a std::array<std::array<int, arrayWidth>, arrayHeight>.

Still, you might want to check whether sizeof(std::array<int, 100>) == sizeof(int) * 100 holds with your standard-library implementation. If it does, it should be safe to replace even the 2D arrays.

I wonder how that replacement should even work in code full of pointer arithmetic.

/// @file array_eval.cpp
#include <iostream>
#include <array>
#include <algorithm>


int main() {
    auto dump = [](const int& n){std::cout << n << " ";};

#ifdef DO_FAIL
    std::array<int, 10> arr;
#else    
    int arr[10];
#endif

    // this does not work for std::arrays
    int* p = arr; 

    std::for_each(p, p+10, dump);
    std::cout << std::endl;
    return 0;
}

And

g++ -Wall -pedantic -std=c++11 -DDO_FAIL array_eval.cpp 

of course fails:

array_eval.cpp: In function ‘int main()’:
array_eval.cpp:17:14: error: cannot convert ‘std::array<int, 10ul>’ to ‘int*’ in initialization
     int* p = arr; 
              ^

It depends on the standard-library implementation. I mean, the standard does not prevent implementing std::array with more members, or reserving more memory than is really necessary (for example, for debugging), but I think it is very improbable to find a std::array implementation with anything beyond the T elems[N]; data member.

If we assume the std::array implementation contains just one field to store the data, and that it allocates just the necessary memory (no more), then int v[100]; and the storage inside array<int, 100> v; will have the same layout, since from the standard:

[array.overview 23.3.2.1 p1]:

The elements of an array are stored contiguously, meaning that if a is an array<T, N> then it obeys the identity &a[n] == &a[0] + n for all 0 <= n < N .

and [class.mem 9.2 p20]:

A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast , points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. —end note ]

Anyway, that depends on the compiler and standard-library implementation. But the reversed code depends on the compiler too: why are you assuming that int a; int b[50]; will place a and then the array b in memory in that order, and not the other way around, when those declarations are not part of a struct or class? The compiler could decide otherwise for performance reasons (though I admit that is improbable).
