
Is there *any* way to get the length of a C-style array in C++/G++?

I've been trying to implement a lengthof (T* v) function for quite a while, so far without any success.

There are two basic, well-known solutions for T v[n] arrays, both of which are useless or even dangerous once the array has decayed into a T* v pointer.

#define SIZE(v) (sizeof(v) / sizeof(v[0]))

template <class T, size_t n>
size_t lengthof (T (&) [n])
{
    return n;
}
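
A minimal sketch of why both approaches stop working after decay. The helper size_on_pointer is hypothetical and exists only to force the decay; its body is what SIZE(v) expands to, split into two statements.

```cpp
#include <cstddef>

#define SIZE(v) (sizeof(v) / sizeof(v[0]))

template <class T, std::size_t n>
std::size_t lengthof(T (&)[n])
{
    return n;
}

// Hypothetical helper: inside this function v is only an int*, so the
// computation divides the size of a pointer by the size of an int.
std::size_t size_on_pointer(int* v)
{
    std::size_t bytes = sizeof(v);   // size of the pointer, not of the array
    return bytes / sizeof(v[0]);
}
```

With int a[7], both SIZE(a) and lengthof(a) yield 7, while size_on_pointer(a) yields sizeof(int*) / sizeof(int) regardless of the real length. Calling lengthof on an int* does not compile at all, which is the safer of the two failures.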

There are workarounds involving wrapper classes and containers like STLSoft's array_proxy, boost::array, std::vector, etc. All of them have drawbacks, and lack the simplicity, syntactic sugar and widespread usage of arrays.

There are myths about solutions involving the compiler-specific calls that the compiler normally uses when delete [] needs to know the length of the array. According to the C++ FAQ Lite 16.14, compilers use two techniques to know how much memory to deallocate: over-allocation and associative arrays. With over-allocation, the compiler allocates one extra machine word and stores the length of the array before the first object. The other method stores the lengths in an associative array. Is it possible to know which method G++ uses, and to extract the appropriate array length? What about overheads and padding? Any hope for non-compiler-specific code? Or even non-platform-specific G++ builtins?

There are also solutions involving overloading operator new [] and operator delete [], which I implemented:

std::map<void*, size_t> arrayLengthMap;

inline void* operator new [] (size_t n)
throw (std::bad_alloc)
{
    void* ptr = GC_malloc(n);
    arrayLengthMap[ptr] = n;
    return ptr;
}

inline void operator delete [] (void* ptr)
throw ()
{
    arrayLengthMap.erase(ptr);
    GC_free(ptr);
}

template <class T>
inline size_t lengthof (T* ptr)
{
    std::map<void*, size_t>::const_iterator it = arrayLengthMap.find(ptr);
    if( it == arrayLengthMap.end() ){
        throw std::bad_alloc();
    }
    return it->second / sizeof(T);
}

It was working nicely until I got a strange error: lengthof couldn't find an array. As it turned out, G++ allocated 8 more bytes at the start of this specific array than it should have. Although operator new [] should have returned the start of the entire array (call it ptr), the calling code got ptr+8 instead, so lengthof(ptr+8) failed with the exception (and even if it had not, it could have returned a wrong array size). Are those 8 bytes some kind of overhead or padding? It cannot be the previously mentioned over-allocation, since the function worked correctly for many arrays. What is it, and how can I disable or work around it, assuming it is possible to use G++-specific calls or trickery?

Edit: Because there are numerous ways to allocate C-style arrays, it is not generally possible to tell the length of an arbitrary array from its pointer, just as Oli Charlesworth suggested. But it is possible for non-decayed static arrays (see the template function above), and for arrays allocated with a custom operator new [] (size_t, size_t), based on an idea by Ben Voigt:

#include <gc/gc.h>
#include <gc/gc_cpp.h>
#include <iostream>
#include <map>

typedef std::map<void*, std::pair<size_t, size_t> > ArrayLengthMap;
ArrayLengthMap arrayLengthMap;

inline void* operator new [] (size_t size, size_t count)
throw (std::bad_alloc)
{
    void* ptr = GC_malloc(size);
    arrayLengthMap[ptr] = std::pair<size_t, size_t>(size, count);
    return ptr;
}

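inline void operator delete [] (void* ptr)
throw ()
{
    // Find the last recorded block that starts at or before ptr.
    ArrayLengthMap::const_iterator it = arrayLengthMap.upper_bound(ptr);
    if( it != arrayLengthMap.begin() ){
        it--;
        // Arithmetic on void* is a GNU extension, so work with char*.
        char* first = static_cast<char*>(it->first);
        char* p = static_cast<char*>(ptr);
        if( first <= p and p < first + it->second.first ){
            arrayLengthMap.erase(it->first);
        }
    }
    GC_free(ptr);
}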

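inline size_t lengthof (void* ptr)
{
    // Find the last recorded block that starts at or before ptr.
    ArrayLengthMap::const_iterator it = arrayLengthMap.upper_bound(ptr);
    if( it != arrayLengthMap.begin() ){
        it--;
        // Arithmetic on void* is a GNU extension, so work with char*.
        char* first = static_cast<char*>(it->first);
        char* p = static_cast<char*>(ptr);
        if( first <= p and p < first + it->second.first ){
            return it->second.second;
        }
    }
    throw std::bad_alloc();
}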

int main (int argc, char* argv[])
{
    int* v = new (112) int[112];
    std::cout << lengthof(v) << std::endl;
}

Unfortunately, due to arbitrary overheads and padding added by the compiler, there is so far no reliable way to determine the length of a dynamic array in a custom operator new [] (size_t), unless we assume that the padding is smaller than the size of one element of the array.
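
The assumption in the last clause comes down to plain arithmetic. The following is only a sketch: estimated_count is a hypothetical helper, and the byte counts mentioned afterwards are made-up inputs, not values measured from G++.

```cpp
#include <cstddef>

// Hypothetical helper: guesses the element count from the byte count that
// was passed to operator new[]. Integer division discards the extra bytes
// only when they add up to less than one element.
inline std::size_t estimated_count(std::size_t requested_bytes,
                                   std::size_t element_size)
{
    return requested_bytes / element_size;
}
```

For 10 elements of 16 bytes plus 8 extra bytes, estimated_count(168, 16) gives 10. For 10 elements of 4 bytes plus the same 8 extra bytes, estimated_count(48, 4) gives 12, which is why the estimate cannot be trusted in general.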

However, as Ben Voigt suggested, there are other kinds of arrays for which length calculation might be possible. It should therefore be possible and desirable to construct a wrapper class that accepts several kinds of arrays (and their lengths) in its constructors, and that is implicitly or explicitly convertible to other wrapper classes and array types. The different lifetimes of different kinds of arrays might be a problem, but that could be solved with garbage collection.
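
A rough sketch of such a wrapper, assuming a non-owning view is enough. The name ArrayRef, its members and the sum function are illustrative and not taken from any of the libraries mentioned above; C++20's std::span covers similar ground.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical non-owning wrapper: keeps a pointer together with a length.
template <class T>
class ArrayRef {
public:
    // Non-decayed static array: the compiler deduces the length.
    template <std::size_t n>
    ArrayRef(T (&v)[n]) : data_(v), size_(n) {}

    // Decayed pointer: the caller has to supply the length.
    ArrayRef(T* v, std::size_t n) : data_(v), size_(n) {}

    // std::vector: the container already knows its length.
    ArrayRef(std::vector<T>& v) : data_(v.data()), size_(v.size()) {}

    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) const { return data_[i]; }

private:
    T* data_;
    std::size_t size_;
};

// Example consumer: accepts any of the array kinds above through ArrayRef.
inline long sum(ArrayRef<int> v)
{
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}
```

A function written against ArrayRef<int> can then be called with a static array or a std::vector<int> directly, or with ArrayRef<int>(p, n) for a pointer whose length the caller tracks. The sketch does nothing about lifetimes: the view is only valid while the underlying array is.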

To answer this:

Any hope for non-compiler-specific code?

No.

More generally, if you find yourself needing to do this, then you probably need to reconsider your design. Use a std::vector, for instance.

Your analysis is mostly correct; however, I think you've ignored the fact that types with trivial destructors don't need to store the length, and so the over-allocation can be different for different types.
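
One way to observe this without reading undocumented memory is to give a class its own operator new[] and compare the pointer it returns with the pointer the new-expression yields. This is a sketch under assumptions: Recorder, Trivial, NonTrivial and cookie_bytes are hypothetical names, and the actual offsets depend on the compiler and ABI.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical base class: remembers the last pointer that its
// operator new[] handed back to the compiler.
struct Recorder {
    static void*& last_raw() { static void* p = nullptr; return p; }

    static void* operator new[](std::size_t bytes)
    {
        void* p = std::malloc(bytes);
        if (!p) throw std::bad_alloc();
        last_raw() = p;
        return p;
    }
    static void operator delete[](void* p) { std::free(p); }
};

struct Trivial : Recorder { int value; };                       // trivial destructor
struct NonTrivial : Recorder { int value; ~NonTrivial() {} };   // user-provided destructor

// Distance in bytes between the start of the allocation and the first element.
template <class T>
std::size_t cookie_bytes(std::size_t count)
{
    T* elements = new T[count];
    std::size_t offset = static_cast<std::size_t>(
        reinterpret_cast<char*>(elements) - static_cast<char*>(Recorder::last_raw()));
    delete[] elements;
    return offset;
}
```

With G++ on a typical 64-bit target one would expect cookie_bytes<Trivial>(10) to be 0 and cookie_bytes<NonTrivial>(10) to be 8, which would match the ptr+8 seen in the question, but that is an expectation about one ABI rather than a guarantee.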

The standard allows an array new-expression to reserve a few bytes of the allocation for its own use, so the pointer the caller sees may not be the one operator new[] returned. You'll therefore have to do a range check on the pointer instead of an exact match. std::map probably won't be efficient for this, but a sorted vector should be (it can be binary searched). A balanced tree should also work really well.
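
A minimal sketch of the sorted-vector variant with a range check. Block and BlockRegistry are hypothetical names; the registry only records blocks and leaves allocation, thread safety and overlapping ranges out of scope.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

struct Block {
    const char* begin;   // first byte of the allocation
    std::size_t bytes;   // size of the allocation in bytes
    std::size_t count;   // element count recorded at allocation time
};

class BlockRegistry {
public:
    // Inserts a block while keeping the vector sorted by start address.
    void add(const void* begin, std::size_t bytes, std::size_t count)
    {
        Block b = { static_cast<const char*>(begin), bytes, count };
        blocks_.insert(
            std::upper_bound(blocks_.begin(), blocks_.end(), b.begin, starts_after), b);
    }

    // Returns the block whose byte range contains p, or nullptr if there is none.
    const Block* find(const void* p) const
    {
        const char* q = static_cast<const char*>(p);
        std::vector<Block>::const_iterator it =
            std::upper_bound(blocks_.begin(), blocks_.end(), q, starts_after);
        if (it == blocks_.begin()) return nullptr;   // every block starts after p
        --it;                                        // last block starting at or before p
        return std::less<const char*>()(q, it->begin + it->bytes) ? &*it : nullptr;
    }

    // Forgets the block containing p; returns false if p is unknown.
    bool remove(const void* p)
    {
        const Block* b = find(p);
        if (!b) return false;
        blocks_.erase(blocks_.begin() + (b - blocks_.data()));
        return true;
    }

private:
    // Comparator for upper_bound: true if block b starts after address p.
    static bool starts_after(const char* p, const Block& b)
    {
        return std::less<const char*>()(p, b.begin);
    }

    std::vector<Block> blocks_;
};
```

Lookup is a binary search, so it costs O(log n) comparisons; insertion and removal shift elements and are O(n), which is the trade-off against a balanced tree.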

Some time ago, I used a similar technique to monitor memory leaks:

When asked to allocate size bytes of data, I would allocate size + 4 bytes and store the length of the allocation in the first 4 bytes:

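static unsigned int total_still_alloced = 0;

/* UINT and ENABLED( MEMLEAK_CHECK ) come from the surrounding project.
   Needs <stdlib.h> and <string.h>. */
void *sys_malloc(UINT size)
{
#if ENABLED( MEMLEAK_CHECK )
  /* Over-allocate by one UINT to make room for the size header. */
  void *result = malloc(size+sizeof(UINT));
  if(result)
  {
    memset(result,0,size+sizeof(UINT));
    *(UINT *)result = size;
    total_still_alloced += size;
    /* Skip exactly one UINT: adding sizeof(UINT) to a UINT* would skip
       sizeof(UINT) elements, not bytes, and run past the allocation. */
    return (void*)((UINT*)result+1);
  }
  else
  {
    return result;
  }
#else
  void *result = malloc(size);
  if(result) memset(result,0,size);
  return result;
#endif
}

void sys_free(void *p)
{
  if(p != NULL)
  {
#if ENABLED( MEMLEAK_CHECK )
    /* Step back one UINT to reach the header written by sys_malloc. */
    UINT * real_address = (UINT *)(p)-1;
    total_still_alloced-= *real_address;

    free((void*)real_address);
#else
    free(p);
#endif
  }
}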

In your case, retrieving the allocated size is a matter of stepping back 4 bytes from the provided address and reading the value stored there.
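
A minimal sketch of that lookup, written independently of the routines above. The names sized_malloc, sized_length, sized_count and sized_free are hypothetical, and the header is a union padded to std::max_align_t, an assumption added here so that the returned block stays suitably aligned (the version above stores a bare UINT).

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical header placed in front of every block. The max_align_t member
// pads it so that the memory after the header is aligned for any type.
union Header {
    std::size_t size;
    std::max_align_t align;
};

void* sized_malloc(std::size_t size)
{
    Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + size));
    if (!h) return nullptr;
    h->size = size;   // remember the requested size
    return h + 1;     // the caller sees the memory just after the header
}

// Reads the size back; p must come from sized_malloc.
std::size_t sized_length(const void* p)
{
    return (static_cast<const Header*>(p) - 1)->size;
}

// Element count, assuming the block was allocated as an array of T.
template <class T>
std::size_t sized_count(const T* p)
{
    return sized_length(p) / sizeof(T);
}

void sized_free(void* p)
{
    if (p) std::free(static_cast<Header*>(p) - 1);
}
```

This only works for pointers that are exactly the ones sized_malloc returned; a pointer into the middle of the block would read arbitrary bytes as the size.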

Note that if you have memory corruption somewhere, you'll get invalid results. Note also that this is often how malloc works internally: it puts the size of the allocation in a hidden field before the address it returns. On some architectures I don't even have to allocate more; using the system malloc is sufficient.

That's an invasive way of doing it, but it works (provided you allocate everything with these modified allocation routines, and that you know the starting address of your array).
