
uninitialized_copy memcpy/memmove optimization

I've recently started examining MSVC's implementation of the STL. There are some nice tricks in it, but I don't understand why the following criteria are used.

std::uninitialized_copy is optimized to a simple memcpy/memmove if certain conditions are met. As I understand it, the input range can be memcpy'd to the uninitialized area if the target type U is trivially copy-constructible from the source type T.

However, the MSVC implementation checks a lot more than that before choosing memcpy over copy-constructing the elements one by one. I didn't want to paste the related code here, so I'm sharing it through pastebin instead, if anyone is interested: https://pastebin.com/Sa4Q7Qj0

The base algorithm for uninitialized_copy is something like this (exception handling is omitted for readability):

template <typename T, typename... Args>
inline void construct_in_place(T& obj, Args&&... args) {
    ::new (static_cast<void*>(addressof(obj))) T(forward<Args>(args)...);
}

template <typename In, typename Out>
inline Out uninitialized_copy(In first, In last, Out dest) {
    for (; first != last; ++first, ++dest)
        construct_in_place(*dest, *first);
    return dest;
}

This can be optimized to a memcpy/memmove if the copy construction doesn't do anything 'special' (i.e., the type is trivially copy-constructible).

MS's implementation requires the following:
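To make the idea concrete, here is a minimal C++17 sketch of such a dispatch for plain pointers. The name sketch_uninitialized_copy and the exact fast-path condition are assumptions made for illustration; they follow the simplified criterion discussed in this question, not the checks any real standard library performs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical helper for illustration; not taken from any real standard library.
// Copies [first, last) into storage starting at dest and returns the new end.
template <typename T>
T* sketch_uninitialized_copy(const T* first, const T* last, T* dest) {
    if constexpr (std::is_trivially_copyable_v<T> &&
                  std::is_trivially_copy_constructible_v<T>) {
        // Fast path: one bulk byte copy (assumed criterion, see the text above).
        const std::size_t count = static_cast<std::size_t>(last - first);
        if (count != 0)
            std::memmove(static_cast<void*>(dest), first, count * sizeof(T));
        return dest + count;
    } else {
        // Slow path: copy-construct one element at a time, with rollback.
        T* cur = dest;
        try {
            for (; first != last; ++first, ++cur)
                ::new (static_cast<void*>(cur)) T(*first);
        } catch (...) {
            std::destroy(dest, cur);  // destroy what was already constructed
            throw;
        }
        return cur;
    }
}

// Small self-check that exercises the fast path with int.
inline void sketch_copy_ints_demo() {
    const int src[3] = {1, 2, 3};
    int dst[3] = {0, 0, 0};
    int* end = sketch_uninitialized_copy(src, src + 3, dst);
    assert(end == dst + 3);
    assert(dst[0] == 1 && dst[1] == 2 && dst[2] == 3);
}
```

Whether the fast path is formally allowed for a given T is exactly what the rest of this thread is about, so treat the condition above as a placeholder.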

  • T trivially assignable to U
  • T trivially copyable to U
  • T being trivial
  • extra checks (like sizeof(T) == sizeof(U)) if T != U
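As a rough illustration, the conditions in this list could be combined into a single trait along the following lines. This is only a sketch: the names sketch_same_bits_v and sketch_can_memmove_v are hypothetical, and the "extra checks" are my own simplified guess (same type, or two non-bool integral types of equal size), not MSVC's actual code. Requiring U to be trivial as well is also an assumption.

```cpp
#include <type_traits>

// Hypothetical helper: do T and U plausibly share one bit representation?
// Assumption: identical types (ignoring cv), or two non-bool integral
// types of the same size.
template <typename T, typename U>
inline constexpr bool sketch_same_bits_v =
    std::is_same_v<std::remove_cv_t<T>, std::remove_cv_t<U>> ||
    (std::is_integral_v<T> && std::is_integral_v<U> &&
     !std::is_same_v<std::remove_cv_t<T>, bool> &&
     !std::is_same_v<std::remove_cv_t<U>, bool> &&
     sizeof(T) == sizeof(U));

// Hypothetical combined check; T is the source type, U the target type.
template <typename T, typename U>
inline constexpr bool sketch_can_memmove_v =
    std::is_trivially_assignable_v<U&, const T&> &&
    std::is_trivially_constructible_v<U, const T&> &&
    std::is_trivial_v<T> && std::is_trivial_v<U> &&
    sketch_same_bits_v<T, U>;

static_assert(sketch_can_memmove_v<int, int>);
static_assert(sketch_can_memmove_v<int, unsigned>);
static_assert(!sketch_can_memmove_v<int, float>);  // same size on many platforms, different bits
```

Note how the size check alone would not be enough: int and float often have the same size, yet converting one to the other changes the bit pattern.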

So for example the following struct cannot be memcpy'd:

struct Foo {
    int i;
    Foo() : i(10) { }
};

but the following is ok:

struct Foo {
    int i;
    Foo() = default; // or simply omit
};
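The difference between the two structs can be checked directly with the standard type traits (C++17). The struct names FooUserCtor and FooDefaulted are made up here so that both variants can live in one file.

```cpp
#include <type_traits>

struct FooUserCtor {   // first variant: user-provided default constructor
    int i;
    FooUserCtor() : i(10) { }
};

struct FooDefaulted {  // second variant: defaulted default constructor
    int i;
    FooDefaulted() = default;
};

// Both variants copy trivially...
static_assert(std::is_trivially_copyable_v<FooUserCtor>);
static_assert(std::is_trivially_copy_constructible_v<FooUserCtor>);
static_assert(std::is_trivially_copyable_v<FooDefaulted>);

// ...but only the second one is a trivial type.
static_assert(!std::is_trivially_default_constructible_v<FooUserCtor>);
static_assert(!std::is_trivial_v<FooUserCtor>);
static_assert(std::is_trivial_v<FooDefaulted>);
```

So the only trait that separates them is triviality of the default constructor, which the copy itself never calls.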

Shouldn't it be enough to check if type U can be trivially copy-constructed from type T? Because that's all uninitialized_copy does.

For example, I can't see why the following is not memcpy'd by MS's STL implementation (NOTE: I know the immediate reason, which is the user-defined constructor, but I don't understand the logic behind it):

struct Foo {
    int i;

    Foo() noexcept
        : i(10) { }

    Foo(const Foo&) = default;
};

void test() {
    // please forgive me...
    alignas(Foo) unsigned char raw[256];  // raw storage, suitably aligned for Foo
    Foo* dest = (Foo*)raw;
    Foo src[] = { Foo(), Foo() };

    bool b = std::is_trivially_copy_constructible<Foo>::value;  // true
    bool b2 = std::is_trivially_copyable<Foo>::value;           // true

    memcpy(dest, src, sizeof(src)); // seems ok

    // uninitialized_copy does not use memcpy/memmove, it calls the copy-ctor one-by-one
    std::uninitialized_copy(src, src + sizeof(src) / sizeof(src[0]), dest);
}

Related SO post: Why doesn't gcc use memmove in std::uninitialized_copy?


As @Igor Tandetnik pointed out in the comments, it is not safe to assume that a type T is trivially copy-constructible just because it has no user-defined copy constructor. He provided the following example:

struct Foo {
    std::string data;
};

In this example, there is no user-defined copy constructor and it is still not trivially copy-constructible. Thank you for the correction, I modified the original post based on the feedback.
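This is easy to verify with the standard traits (C++17). The struct is renamed to Holder here only to avoid clashing with the other Foo definitions in this post.

```cpp
#include <string>
#include <type_traits>

struct Holder {        // same shape as the Foo in the example above
    std::string data;
};

// The implicitly generated copy constructor exists...
static_assert(std::is_copy_constructible_v<Holder>);
// ...but it has to call std::string's copy constructor, so it is not trivial.
static_assert(!std::is_trivially_copy_constructible_v<Holder>);
static_assert(!std::is_trivially_copyable_v<Holder>);
```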

uninitialized_copy has two responsibilities: First, it has to make sure that the right bit-pattern gets into the destination buffer. Second, it has to start the lifetime of the C++ objects in that buffer. That is, it must call a constructor of some kind, unless the C++ Standard specifically grants it permission to skip that constructor call.

According to my very incomplete research, it appears that right now only trivially copyable types are guaranteed to have their bit patterns preserved by memcpy/memmove; memcpying any other kind of type (even if it happens to be trivially copy-constructible and/or trivially copy-assignable!) formally produces undefined behavior.
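The guarantee for trivially copyable types can be illustrated with a round trip through a byte buffer. This is a small C++17 sketch; round_trip_through_bytes and Point are names invented for the example, and the default-constructibility requirement is an artifact of how the sketch creates the destination object.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Copies value's bytes into a buffer and back into a fresh object.
// Hypothetical helper; only meaningful for trivially copyable T.
template <typename T>
T round_trip_through_bytes(const T& value) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "byte-wise copying is only guaranteed for trivially copyable types");
    static_assert(std::is_default_constructible_v<T>,
                  "the sketch needs an existing object to copy the bytes into");
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    T copy{};                              // an already-live destination object
    std::memcpy(&copy, bytes, sizeof(T));
    return copy;
}

struct Point { int x; int y; };  // hypothetical example type

inline void round_trip_demo() {
    const Point p = round_trip_through_bytes(Point{3, 4});
    assert(p.x == 3 && p.y == 4);
}
```

Note that the destination here is an object that already exists, so the sketch only shows the "bit pattern" half of the problem, not the "start the lifetime" half.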

And furthermore, it appears that right now only trivial types can "pop into existence" without a constructor call. (P0593 "Implicit creation of objects..." proposes a lot of changes in this area, maybe in C++2b.)

Jonathan Wakely's comment on libstdc++ bug 68350 seems to indicate that GNU libstdc++ is trying to remain within the letter of the law by never "popping into existence" any objects of non-trivial type — even though, as a C++ implementation, they do have latitude to exploit platform-specific behavior in the name of performance. I would guess that MSVC is following similar logic, for similar reasons (whatever those reasons are).

You can see the vendors' unwillingness to "pop objects into existence" by comparing their willingness to optimize std::copy versus std::uninitialized_copy on class types which are "trivially copyable but not trivial." Being trivially copyable means std::copy can use memcpy to assign over existing objects; but std::uninitialized_copy, to make those objects pop into existence in the first place, still feels the need to call some constructor in a loop, even if it's the trivial copy constructor!

class C { int i; public: C() = default; };
class D { int i; public: D() {} };
static_assert(std::is_trivially_copyable_v<C> && !std::is_aggregate_v<C>);
static_assert(std::is_trivially_copyable_v<D> && !std::is_aggregate_v<D>);

void copyCs(C *p, C *q, int n) {
    std::copy(p, p+n, q);  // GNU and MSVC both optimize
    std::uninitialized_copy(p, p+n, q);  // GNU and MSVC both optimize
}
void copyDs(D *p, D *q, int n) {
    std::copy(p, p+n, q);  // GNU and MSVC both optimize
    std::uninitialized_copy(p, p+n, q);  // neither GNU nor MSVC optimizes :(
}

You wrote:

Shouldn't it be enough to check if type U can be trivially copy-constructed from type T? Because that's all uninitialized_copy does.

Yes, but when T and U are different, you're not doing "trivial copy-construction"; you're doing a "trivial construction" that is not copy-construction. And unfortunately the C++ Standard defines is_trivially_constructible<T,U> to mean something different from what humans mean by "trivial"! My blog post "Trivially-constructible-from" (July 2018) gives this example:

assert(is_trivially_constructible_v<u64, u64b>);
// Yay!

using u16 = short;
assert(is_trivially_constructible_v<u64, u16>);
// What the...

assert(is_trivially_constructible_v<u64, double>);
// ...oh geez.
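To see why this matters for memcpy, here is a small C++17 sketch using built-in types in place of the aliases above. The helper conversion_equals_bit_copy is a made-up name, and the example assumes that long long and double have the same size and that double is IEEE 754; both assumptions are checked with static_assert.

```cpp
#include <cassert>
#include <cstring>
#include <limits>
#include <type_traits>

// The trait says "trivial" for conversions that clearly change the bits.
static_assert(std::is_trivially_constructible_v<long long, short>);
static_assert(std::is_trivially_constructible_v<long long, double>);

// Returns true when converting d to long long gives the same bytes
// as copying d's object representation would.
inline bool conversion_equals_bit_copy(double d) {
    static_assert(sizeof(long long) == sizeof(double),
                  "sketch assumes equal sizes");
    static_assert(std::numeric_limits<double>::is_iec559,
                  "sketch assumes IEEE 754 doubles");
    const long long converted = static_cast<long long>(d);
    long long bits = 0;
    std::memcpy(&bits, &d, sizeof(bits));
    return converted == bits;
}

inline void conversion_demo() {
    assert(conversion_equals_bit_copy(0.0));   // all-zero bits either way
    assert(!conversion_equals_bit_copy(1.0));  // value 1 vs. the bit pattern of 1.0
}
```

In other words, "trivially constructible from" only says that no user-provided code runs; it does not say that a byte copy gives the same result as the conversion.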

This explains some of MSVC's

extra checks (like sizeof(T) == sizeof(U)) if T != U

Specifically, MSVC's _Ptr_cat_helper<T*,U*>::_Really_trivial trait relies on those extra checks to detect some (but not all) common situations where the conversion from T to U is "really" trivial in the human/bitwise sense, and not just trivial in the C++-Standard sense. This allows MSVC to optimize copying an array of int* into an array of const int*, which is something libstdc++ can't do:

using A = int*;
using B = const int*;

void copyAs(A *p, B *q, int n) {
    std::uninitialized_copy(p, p+n, q);  // only MSVC optimizes
}
void copyBs(B *p, B *q, int n) {
    std::uninitialized_copy(p, p+n, q);  // GNU and MSVC both optimize
}
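For the pointer case, a "bitwise trivial" check might look roughly like the following C++17 sketch. The trait sketch_pointer_bits_compatible_v and the function copy_pointers_bitwise are hypothetical names; this is not how _Really_trivial is actually written, just one way to express "same pointee type, only qualifiers added."

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical trait: From and To are pointers to the same type up to
// cv-qualification, and the conversion From -> To is allowed (so it can
// only add qualifiers). Such a conversion does not change the bits.
template <typename From, typename To>
inline constexpr bool sketch_pointer_bits_compatible_v =
    std::is_pointer_v<From> && std::is_pointer_v<To> &&
    std::is_convertible_v<From, To> &&
    std::is_same_v<std::remove_cv_t<std::remove_pointer_t<From>>,
                   std::remove_cv_t<std::remove_pointer_t<To>>>;

static_assert(sketch_pointer_bits_compatible_v<int*, const int*>);
static_assert(!sketch_pointer_bits_compatible_v<const int*, int*>);  // would drop const
static_assert(!sketch_pointer_bits_compatible_v<int*, long*>);       // unrelated pointee

// Copies n pointers from src to dst with one memcpy instead of a loop.
inline void copy_pointers_bitwise(int* const* src, const int** dst, std::size_t n) {
    static_assert(sketch_pointer_bits_compatible_v<int*, const int*>);
    if (n != 0)
        std::memcpy(dst, src, n * sizeof(int*));
}

inline void pointer_copy_demo() {
    int a = 1, b = 2;
    int* src[2] = {&a, &b};
    const int* dst[2] = {nullptr, nullptr};
    copy_pointers_bitwise(src, dst, 2);
    assert(dst[0] == &a && dst[1] == &b);
}
```

A Derived* to Base* conversion is deliberately rejected by the same-pointee check, since that conversion may need to adjust the pointer value.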
