
Finding the intersection of two vectors of strings

I have two vectors of strings and want to find the strings that are present in both, filling a third vector with the common elements. EDIT: I've added the complete code listing with its output so that things are clear.

  std::cout << "size " << m_HLTMap->size() << std::endl;

  /// Vector to store the wanted, present and found triggers
  std::vector<std::string> wantedTriggers;
  wantedTriggers.push_back("L2_xe25");
  wantedTriggers.push_back("L2_vtxbeamspot_FSTracks_L2Star_A");
  std::vector<std::string> allTriggers;

  // Push all the trigger names to a vector
  std::map<std::string, int>::iterator itr = m_HLTMap->begin();
  std::map<std::string, int>::iterator itrLast = m_HLTMap->end();
  for(;itr!=itrLast;++itr)
  {
    allTriggers.push_back((*itr).first);
  }; // End itr

  /// Sort the list of trigger names and find the intersection
  /// Build a typedef to make things clearer
  std::vector<std::string>::iterator wFirst = wantedTriggers.begin();
  std::vector<std::string>::iterator wLast = wantedTriggers.end();
  std::vector<std::string>::iterator aFirst = allTriggers.begin();
  std::vector<std::string>::iterator aLast = allTriggers.end();

  std::vector<std::string> foundTriggers;

  for(;aFirst!=aLast;++aFirst)
  {
    std::cout << "Found:" << (*aFirst) << std::endl; 
  };

  std::vector<std::string>::iterator it;

  std::sort(wFirst, wLast);
  std::sort(aFirst, aLast);
  std::set_intersection(wFirst, wLast, aFirst, aLast, back_inserter(foundTriggers));

  std::cout << "Found this many triggers: " << foundTriggers.size() << std::endl;
  for(it=foundTriggers.begin();it!=foundTriggers.end();++it)
  {
    std::cout << "Found in both" << (*it) << std::endl;
  }; // End for intersection

The output is then:

Here is the partial output. There are over 1000 elements in the vector, so I didn't include the full output:

Found:L2_te1400
Found:L2_te1600
Found:L2_te600
Found:L2_trk16_Central_Tau_IDCalib
Found:L2_trk16_Fwd_Tau_IDCalib
Found:L2_trk29_Central_Tau_IDCalib
Found:L2_trk29_Fwd_Tau_IDCalib
Found:L2_trk9_Central_Tau_IDCalib
Found:L2_trk9_Fwd_Tau_IDCalib
Found:L2_vtxbeamspot_FSTracks_L2Star_A
Found:L2_vtxbeamspot_FSTracks_L2Star_B
Found:L2_vtxbeamspot_activeTE_L2Star_A_peb
Found:L2_vtxbeamspot_activeTE_L2Star_B_peb
Found:L2_vtxbeamspot_allTE_L2Star_A_peb
Found:L2_vtxbeamspot_allTE_L2Star_B_peb
Found:L2_xe25
Found:L2_xe35
Found:L2_xe40
Found:L2_xe45
Found:L2_xe45T
Found:L2_xe55
Found:L2_xe55T
Found:L2_xe55_LArNoiseBurst
Found:L2_xe65
Found:L2_xe65_tight
Found:L2_xe75
Found:L2_xe90
Found:L2_xe90_tight
Found:L2_xe_NoCut_allL1
Found:L2_xs15
Found:L2_xs30
Found:L2_xs45
Found:L2_xs50
Found:L2_xs60
Found:L2_xs65
Found:L2_zerobias_NoAlg
Found:L2_zerobias_Overlay_NoAlg
Found this many triggers: 0

Possible Reason

I am starting to think that the way I compile my code is to blame. I am currently compiling with ROOT (the physics data analysis framework) instead of doing a standalone compile. I get the feeling that it doesn't work all that well with the STL algorithm library and that this is the cause of the issue, especially given how many people seem to have the code working for them. I will try a standalone compilation and re-run.

Passing foundTriggers.begin() as the output argument while foundTriggers is empty will not cause the output to be pushed onto foundTriggers. Instead, the algorithm will increment the iterator past the end of the vector without resizing it, randomly corrupting memory.

You want to use an insert iterator:

std::set_intersection(wFirst, wLast, aFirst, aLast, 
    std::back_inserter(foundTriggers));
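For reference, here is a minimal self-contained sketch of the same idea. The helper name intersect_strings is hypothetical (it does not appear in the question), and it assumes that copying both vectors is acceptable so the callers' data is left unsorted:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper, not from the original post: returns the strings
// present in both inputs. Arguments are taken by value so the callers'
// vectors are not reordered.
std::vector<std::string> intersect_strings(std::vector<std::string> a,
                                           std::vector<std::string> b)
{
    // std::set_intersection requires both input ranges to be sorted.
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    std::vector<std::string> common;
    // back_inserter calls push_back, so `common` grows as needed.
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(common));

    // Sanity check: the output of set_intersection is itself sorted.
    assert(std::is_sorted(common.begin(), common.end()));
    return common;
}
```

Called with the question's wantedTriggers and allTriggers, this would return the names common to both, in sorted order.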

UPDATE: As pointed out in the comments, the vector is resized to be at least large enough for the result, so your code should work. Note that you should use the iterator returned from set_intersection to mark the end of the intersection. Your code ignores it, so you will also iterate over the empty strings left at the end of the output.

Could you post a complete test case so that we can see whether the intersection is actually empty or not?
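A rough sketch of the pre-sized variant, using the iterator returned by set_intersection to trim the output. The function name intersect_presized is made up for illustration, and sizing the output to the smaller input is an assumption about how the vector was resized:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not from the original post: writes the intersection
// into a pre-sized vector and trims the unused tail afterwards.
std::vector<std::string> intersect_presized(std::vector<std::string> a,
                                            std::vector<std::string> b)
{
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    // The intersection can never hold more elements than the smaller input.
    std::vector<std::string> common(std::min(a.size(), b.size()));

    // set_intersection returns an iterator one past the last element written.
    std::vector<std::string>::iterator last =
        std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                              common.begin());

    // Remove the default-constructed (empty) strings left after `last`.
    common.erase(last, common.end());
    assert(common.size() <= a.size() && common.size() <= b.size());
    return common;
}
```

Without the erase call, a loop from begin() to end() would also visit the empty placeholder strings.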

Your allTriggers vector is empty, after all. You never reset itr to the beginning of the map when you're filling it.

EDIT:

Actually, you never reset aFirst:

for(;aFirst!=aLast;++aFirst)
  {
    std::cout << "Found:" << (*aFirst) << std::endl; 
  };

  // here aFirst == aLast

  std::vector<std::string>::iterator it;

  std::sort(wFirst, wLast);
  std::sort(aFirst, aLast);  // **** sorting empty range ****
  std::set_intersection(wFirst, wLast, aFirst, aLast, back_inserter(foundTriggers));
                               //      ^^^^^^^^^^^^^^
                               // ***** empty range *****

I hope you can now see why it is good practice to narrow down the scope of your variables.
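As an illustration of that advice, one possible restructuring declares the printing iterator inside the for statement and asks the containers for begin() and end() again at each call. The function name print_then_intersect and the std::ostream parameter are assumptions made for this sketch, not part of the question:

```cpp
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical function, not from the original post: prints every name,
// then returns the names that also appear in `wanted`.
std::vector<std::string> print_then_intersect(std::vector<std::string> wanted,
                                              std::vector<std::string> all,
                                              std::ostream& out)
{
    // `it` exists only inside this loop, so an exhausted iterator cannot
    // be passed to a later algorithm by accident.
    for (std::vector<std::string>::const_iterator it = all.begin();
         it != all.end(); ++it)
    {
        out << "Found:" << *it << '\n';
    }

    // Fresh begin()/end() calls always describe the full range.
    std::sort(wanted.begin(), wanted.end());
    std::sort(all.begin(), all.end());

    std::vector<std::string> found;
    std::set_intersection(wanted.begin(), wanted.end(),
                          all.begin(), all.end(),
                          std::back_inserter(found));
    assert(found.size() <= wanted.size());
    return found;
}
```

Here the empty-range mistake from the question cannot occur, because no iterator outlives the loop that advances it.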

You never use the return value of set_intersection. In this case you could use it to resize foundTriggers after set_intersection has returned, or as the upper limit of the for loop. Otherwise your code seems to work. Can we see a full compilable program and its actual output, please?
