简体   繁体   中英

Inserting and deleting elements from vector *at the same time*

Goal

Consider a sorted std::vector x . We want to erase from this vector all elements at positions indicated by vector positionsToErase . We also want to insert the values of vector valuesToInsert at positions positionsToInsert .

These deletions and insertions must happen at the same time , in the sense that if we erase first, then it will invalidates the positions at which we want to insert values (and vice-versa). I think that will be made clear with the below example

Example

Example of function definition

template<typename T>
void insertEraseAtPositions(
   std::vector<T>& x,                       // vector to modify. Is sorted and must remain sorted
   std::vector<T>& valuesToInsert,          // is not sorted
   std::vector<size_t>& positionsToInsert,  // is not sorted. This could be figured out inside the function but I happen to already know the positions at which values must be inserted
   std::vector<size_t>& positionsToErase    // is not sorted
);

Note that non are constant and modifications can be made in-place.

Example of arguments

std::vector<int> x = {0, 10, 20, 21, 30, 50, 60, 70, 81, 90}; // vector to modify
std::vector<int> valuesToInsert = {40, 80, 100}; // Values to insert are '40', '80' and '100'    
std::vector<size_t> positionsToErase  = {3, 8};   // Erase elements '21' and '81'
std::vector<size_t> positionsToInsert = {5, 8, 10};  // Insert where are currently located the elements '50', '81' and past the current last element.

Expected output

x = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100};

Important notes

Performance is very important and it is hence not possible to insert and erase one-by-one (even if we progressively modify positions accordingly) as it would involve way too many copies (or move).

Typically, x is of size 1,000 to 100,000. positionsToInsert (and valuesToInsert ) are of size 1-20 and positionsToErase is of size 1-5. x typically has a capacity that allows inserting the values without reallocating. I hence expect (but might be wrong) that modifications in-place would be faster.

I can also supply iterators instead of indices ( std::vector<std::vector<T>::iterator> instead of std::vector<size_t> ) for positionsToErase and positionsToInsert if you prefer.

Current work

I wrote a code to insert at positions but I failed to include the possibility to erase too. Here is the code in case it helps.

// Return indices representing the order of elements
template <typename T>
std::vector<uint32_t> sort_indices(const std::vector<T> &v) {

  // initialize original index locations
  std::vector<uint32_t> idx(v.size());
  std::iota(idx.begin(), idx.end(), 0);

  // sort indexes based on comparing values in v
  std::sort(idx.begin(), idx.end(),
       [&v](uint32_t i1, uint32_t i2) {return v[i1] < v[i2];});

  return idx;
}

template <typename T>
void reorder(std::vector<T>& v, std::vector<uint32_t>& order)  
{   
    auto v2 = v;
    for (uint32_t i = 0 ; i < v.size() ; i++)
    {
        v[i] = v2[order[i]];
    }
}

//// Insert multiple elements at specified positions into vector
template<typename T>
void insertAtPositions(std::vector<T>& x, std::vector<T>& values, std::vector<size_t>& positions)
{
  // assert values and positions are the same size
  assert(values.size() == positions.size());

  // Special case - values is empty
  if (values.size() == 0) return;

  // Special case - single value to insert
  if (values.size() == 1)
  {
    x.insert(positions.front(), values.front());
    return;
  }

  // sort the values and the positions where those values should be inserted
  auto indices = sort_indices(positions);
  reorder(positions, indices);
  reorder(values, indices);

  // Special case - x is empty
  if (x.size() == 0)
  {
      x.swap(values);
      return;
  }
  
  // Allocate memory to x
  x.resize(x.size() + values.size());
  
  // Move things to make room for insertions and insert
  int pos_index = positions.size()-1;
  for (size_t i = x.size()-1 ; pos_index >= 0 ; --i)
  {
    if (i == positions[pos_index] + pos_index)
    {
      // A new value should go at index i
      x[i] = std::move(values[pos_index]);
      --pos_index;
    } else
    {
      // The value from index 'i-pos_index-1' must go to index 'i'
      x[i] = std::move(x[i-pos_index-1]);
    }
  }
}

Modifying it in place is a no-go.

Consider that you have to insert something at every position. You would need to copy every single item into a temp place then copy them back.

You might argue that you could do it from the end, backwards. But if we have some deletions we would also need to store some of the elements there, potentially getting back to copying every element into some temp storage and back.

I think the fastest way would be to allocate a new array, and build it up, using the original as temp storage. This way you are guaranteed that each element is copied exactly once.

Now, depending on the types used (like ints, or pointers) this could be a lot faster than anything else you might cook up. If copies are expensive consider using moves, or pointers.

If you are worried about performance, you should benchmark you code and tune it. It's hard to argue precisely about performance without data.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM