Why does this happen at the end of the for loop with vectors in C++?

I want to erase repeated elements in a vector. I used a for loop to check whether the next element in the vector is the same as the current element of the iteration, and to delete it if so. For some reason, though, it also deletes the last element, even though that element is not equal to its neighbor.

Here's my code:

#include <string>
#include <vector>
#include <iostream>

using namespace std;

template <typename T> vector<T> uniqueInOrder(const vector<T>& iterable){
    vector<T> coolestVector = iterable;
    for (int i = 0; i < coolestVector.size(); i++)
    {
        if (coolestVector[i] == coolestVector[i+1]){
            coolestVector.erase(coolestVector.begin()+i);
            i--;
        }
        /*for (int i = 0; i < coolestVector.size(); i++)
        {
            cout<<coolestVector[i]<<", ";
        }
        cout<<i<<", ";
        cout<<coolestVector.size();
        cout<<endl;*/
    }

    for (int i = 0; i < coolestVector.size(); i++)
    {
        cout<<coolestVector[i]<<endl;
    }
    
    return coolestVector;
}
vector<char> uniqueInOrder(const string& iterable){
    vector<char> coolVector = {};
    for (int i = 0; i < iterable.size(); i++)
    {
        coolVector.push_back(iterable[i]);
    }
    const vector<char> realVector = coolVector;
    uniqueInOrder(realVector);
}

int main(){
    const string test = "AAAABBBCCDAABBB";
    uniqueInOrder(test);
}

Output:

vector 0: A, A, A, B, B, B, C, C, D, A, A, B, B, B, iterator value -1, vector size 14
vector 0: A, A, B, B, B, C, C, D, A, A, B, B, B, iterator value -1, vector size 13
vector 0: A, B, B, B, C, C, D, A, A, B, B, B, iterator value -1, vector size 12
vector 1: A, B, B, B, C, C, D, A, A, B, B, B, iterator value 0, vector size 12
vector 1: A, B, B, C, C, D, A, A, B, B, B, iterator value 0, vector size 11
vector 1: A, B, C, C, D, A, A, B, B, B, iterator value 0, vector size 10
vector 2: A, B, C, C, D, A, A, B, B, B, iterator value 1, vector size 10
vector 2: A, B, C, D, A, A, B, B, B, iterator value 1, vector size 9
vector 3: A, B, C, D, A, A, B, B, B, iterator value 2, vector size 9
vector 4: A, B, C, D, A, A, B, B, B, iterator value 3, vector size 9
vector 4: A, B, C, D, A, B, B, B, iterator value 3, vector size 8
vector 5: A, B, C, D, A, B, B, B, iterator value 4, vector size 8
vector 5: A, B, C, D, A, B, B, iterator value 4, vector size 7
vector 5: A, B, C, D, A, B, iterator value 4, vector size 6
vector 5: A, B, C, D, A, iterator value 4, vector size 5
A
B
C
D
A

Expected:

A
B
C
D
A
B

Why is the code incorrect?

Many people learn to iterate over an array or vector by memorizing the setup

for (int i = 0; i < X.size(); i++)

This is good for basic loops, but there are times when it is inadequate. Do you know why the conditional is i < X.size()? A basic understanding would say that this conditional ensures that the loop body is executed a number of times equal to the size of X. This is not wrong, but that rationale is more applicable when i is not used inside the loop body. (As an example, that rationale would apply equally well if i started at 1 and the loop continued as long as i <= X.size(), yet that is not a good way to iterate over an array/vector.)

A deeper understanding looks at how i is used in the loop body. A common example is printing the elements of X. (This is preliminary; we'll return to the question's situation later.) A loop that prints the elements of X might look like the following:

for (int i = 0; i < X.size(); i++)
    std::cout << X[i] << ' ';

Note the index given to X: this is key to the loop's condition. The condition's deeper purpose is to ensure that the indices stay within the valid range. The indices given to X must not drop below 0 and they must remain less than X.size(). That is, index < X.size(), where index gets replaced by whatever you have in the brackets. In this case, the thing in the brackets is i, so the condition becomes the familiar i < X.size().

Now let's look at the question's code.

for (int i = 0; i < coolestVector.size(); i++)
{
    if (coolestVector[i] == coolestVector[i+1]){
        // Code not using operator[]
    }
    // Diagnostics
}

There are two places where operator[] is used inside the loop. Apply the above "deeper understanding" to each of them, then combine the resulting conditionals with a logical "and".

  • The first index is i, so the goal index < X.size() becomes i < coolestVector.size() for this case.
  • The second index is i+1, so the goal index < X.size() becomes i+1 < coolestVector.size() for this case.

Combining these gives i < coolestVector.size() && i+1 < coolestVector.size(). This is what the loop's conditional should be to ensure that the indices stay within the valid range. Something logically equivalent would also work. Assuming that i+1 does not overflow (which would entail another class of problems), if i+1 is less than some value then so is i. It is enough to check that i+1 is in range, so we can simplify this conditional to i+1 < coolestVector.size().

for (int i = 0; i+1 < coolestVector.size(); i++)  // <--  Fixed!
{
    if (coolestVector[i] == coolestVector[i+1]){
        // Code not using operator[]
    }
    // Diagnostics
}
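
For reference, here is one way the whole function might look with the corrected bound. This is only a sketch under a few assumptions that are not in the question: it uses std::size_t for the index (to avoid comparing a signed int against size()), and it replaces the erase-then-i-- pattern with a while loop that only advances when nothing was erased.

```cpp
#include <cstddef>
#include <vector>

// Removes adjacent duplicates, keeping the last element of each run.
template <typename T>
std::vector<T> uniqueInOrder(const std::vector<T>& iterable) {
    std::vector<T> result = iterable;
    std::size_t i = 0;
    // Both result[i] and result[i + 1] are read, so i + 1 must be in range.
    while (i + 1 < result.size()) {
        if (result[i] == result[i + 1]) {
            // Erase and stay at i: a new neighbor has moved into i + 1.
            result.erase(result.begin() + i);
        } else {
            ++i;
        }
    }
    return result;
}
```

Called on the characters of "AAAABBBCCDAABBB", this returns A, B, C, D, A, B, which matches the expected output in the question. An empty vector is handled too, because 0 + 1 < 0 is false and the loop body never runs.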

(I know, that was a lot of writing to say "add one". The point is to give you, and future readers, the tools to get the next loop correct.)


Note that the same principle applies to the start of the loop. We start i at 0 so that i >= 0. This happens to imply i+1 >= 0 as well, so in this case there is nothing extra to be done. However, if one of the used indices was i-1, then you would need to ensure i-1 >= 0, which would be done by starting i at 1.
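
To illustrate the i-1 case, here is a sketch of a variant that compares each element with the previous one and builds a new vector instead of erasing. The name uniqueInOrderPrev is made up for this example; it is not from the question.

```cpp
#include <cstddef>
#include <vector>

// Copies X, skipping any element equal to the one just before it.
template <typename T>
std::vector<T> uniqueInOrderPrev(const std::vector<T>& X) {
    std::vector<T> result;
    if (!X.empty()) {
        result.push_back(X[0]);  // the first element has no predecessor
    }
    // The indices used are i and i - 1, so i starts at 1 (making i - 1 >= 0)
    // and stops while i < X.size().
    for (std::size_t i = 1; i < X.size(); i++) {
        if (!(X[i] == X[i - 1])) {
            result.push_back(X[i]);
        }
    }
    return result;
}
```

Here the larger index is i, so the usual i < X.size() is the right upper bound; only the starting value had to change.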

Look at your indices to determine where your loop control variable should start and stop.

I have separated this from my earlier answer because the earlier answer can stand on its own and I do not want it embroiled in the potential controversy that comes from explaining undefined behavior.

Why did the program consistently remove the last element?

Officially, we are in the realm of undefined behavior, so anything is possible. However, it is very likely that this behavior will be seen in all release builds, with two caveats.

  1. An earlier element was removed. If no elements should be removed (a case worth adding to your test suite), then the behavior is unpredictable, possibly a crash but most likely the expected behavior.
  2. Moving an element leaves behind a copy. This is true for simple types like char. You likely will not see this behavior for a vector of std::string.

When an element in the middle of a std::vector is erased, all of the elements after that element are shifted down an index; they are copied (or moved) to the preceding element.

A B B C D
  ^
  |-- erase this
A B B C D
  ^ ^ ^    <--- shift and copy (or move)
  B C D
A B C D D
      ^
      |-- Last element in the vector

Note that space is not released upon erasing. The vector still owns the memory where D used to be; it's just that accessing the element from outside the vector implementation is undefined behavior. Also, that memory is unlikely to have its bits changed by the vector in a release build. So it is very likely that the memory just past the end of the vector holds a copy of the vector's last element, unless the move operation changed it.
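
The "space is not released" part can be observed without touching anything out of range, by comparing capacity() before and after an erase. The helper below is a hypothetical sketch (the name eraseKeepsCapacity is invented here); it deliberately does not read past the end of the vector.

```cpp
#include <cstddef>
#include <vector>

// Erases the element at index pos and reports whether capacity() stayed the same.
// Precondition (assumed, not checked): pos < v.size().
bool eraseKeepsCapacity(std::vector<char>& v, std::size_t pos) {
    const std::size_t before = v.capacity();
    v.erase(v.begin() + pos);        // shifts later elements down by one
    return v.capacity() == before;   // erase does not reallocate
}
```

With v holding A B B C D, erasing index 1 leaves size() at 4 and the contents A B C D, while the storage for the fifth slot is still owned by the vector.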

Now comes your condition. When i is coolestVector.size()-1, you check to see if the last element of the vector (coolestVector[i]) equals the element past the end of the vector (coolestVector[i+1]). A release build will not verify that the index is valid, and the operating system does not care if that location in memory is accessed, so this comparison is likely to go through as one might naively expect. Does the last element of the vector equal the thing from which it was copied? Yes! OK, delete the last element.

This outcome is very likely in a release build, but don't rely on it.
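
If you want the out-of-range read to be reported instead of silently "working", one option is to read through std::vector::at(), which checks the index and throws std::out_of_range. The sketch below keeps the question's loop bound (i < size) on purpose; the function name readsPastEnd and the while-style loop body are choices made for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Runs the question's comparison with the original bound (i < size), but with
// bounds-checked reads. Returns true if a read past the end was attempted.
template <typename T>
bool readsPastEnd(std::vector<T> v) {
    try {
        for (std::size_t i = 0; i < v.size(); ) {
            if (v.at(i) == v.at(i + 1)) {  // at(i + 1) throws when i + 1 == size()
                v.erase(v.begin() + i);
            } else {
                ++i;
            }
        }
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

For any non-empty input the loop eventually reaches the last element and at(i + 1) throws, which is exactly the access that operator[] lets through unchecked.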

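You can use std::set to collect the unique elements. Inserting n elements takes O(n log n) time in total, not O(1). Note that a set removes every duplicate (not just adjacent ones) and iterates in sorted order, so for an input like the question's it yields A B C D rather than A B C D A B.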

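#include <iostream>
#include <set>
#include <string>
#include <vector>

// Prints the distinct elements of v in sorted order; v itself is not modified.
// The size parameter is unused and is kept only to match the original signature.
void Unique_Vector(const std::vector<std::string>& v, int /*size*/)
{
    std::set<std::string> s(v.begin(), v.end());  // the set drops duplicates
    std::cout << "Vector after removing duplicates: ";
    for (const auto& item : s)
    {
        std::cout << item << " ";
    }
}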
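
If the goal is what the question describes (collapse runs of equal neighbors while keeping the original order), the standard library's std::unique from <algorithm> combined with erase is a common alternative. This is a sketch, not something taken from the answers above, and the name uniqueInOrderStd is made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Returns a copy of v with each run of equal adjacent elements reduced to one.
template <typename T>
std::vector<T> uniqueInOrderStd(std::vector<T> v) {
    // std::unique moves the kept elements to the front and returns the new
    // logical end; erase then removes the leftover tail.
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}
```

This runs in linear time and involves no manual index arithmetic, so there is no loop bound to get wrong.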
