For loop slows down
I'm writing a program that loops through a vector of documents (a specific type, pointed to by m_docs). Each doc has an attribute that is a vector of ~17000 zeros, some of which are changed on certain occasions (the point of the loop). I have ~3200 docs.

My problem is that the first hundred docs are processed rather quickly, and then it really slows down. I would like to understand why it slows down, and to know how I could fix it (or at least optimize it).
Portion of the code in question:
for (int k = 0; k < m_docs->size(); k++) {
    int pos;
    std::map<std::string, std::vector<std::pair<int, int> > >::iterator it = m_index.begin();
    std::map<string, int> cleanList = (*m_docs)[k].getCleantList();
    for (auto const& p : cleanList) {
        pos = distance(it, m_index.find(p.first));
        float weight = computeIdf(p.first) * computeTf(p.first, (*m_docs)[k]);
        (*m_docs)[k].setCoord(pos, weight);
    }
}
This could be more efficient: change

std::map<string,int> cleanList

into

std::map<string,int> const& cleanList

Worst case, getCleantList already made the copy, and you get a temporary bound to a const& (which is fine). But far more likely, you decimate memory allocations because you're no longer copying maps containing strings.
Also, look at the efficiency of the search here:

pos = distance(it, m_index.find(p.first));

You called the variable m_index. Note that std::map iterators are only bidirectional, so that std::distance call walks the tree linearly to the found position, on top of the O(log n) find, for every key of every document. You might need to improve locality (flat_map) or use a hash-based container (e.g. unordered_map).
Review your data structures (at the very least for m_index).