简体   繁体   中英

Most efficient way to split a vector into several

I have the following code which break up vectorOfInterest into smaller blocks to send out. this code is working.

However, I do a copy when I split the vectorOfInterest into smaller chunks (in the constructor of subList and remainder). is there a better to use move instead of duplicating the data again for better performance?

note that i cannot change the argument of OTHERCLASS::doSend()

Edit: i am using C++98

int blockSize = 50;
vector <CLASS_T> vectorOfInterest; 

// ...<populates vectorOfInterest>
do {
    if(vectorOfInterest.size()> blockSize)
        vector<CLASS_T>iterator from = vectorOfInterest.begin();
        vector<CLASS_T>iterator to = from + blockSize;

        //elements are copied again in subList and remainder
        //I like to move the elements from vectorOfInterest instead.
        vector<CLASS_T> subList (from, to);  
        vector<CLASS_T> remainder (to, vectorOfInterest.end());
        vectorOfInterest.swap(remainder);

        OTHERCLASS::doSend (subList); // method which sends sublists in blocks of exactly 50 to external library
    }else {
        //pad to exactly size 50 
        vectorOfInterest.resize(blockSize);

         OTHERCLASS::dosend (vectorOfInterest); // method which sends sublists in blocks of exactly 50 to external library

        vectorOfInterest.clear();
    }

while ( !vectorOfInterest.empty());

You shouldn't be erasing elements from vectorOfInterest every iteration. That involves a lot of unnecessary copying. Instead, keep a persistent iterator. You can also avoid doing an allocation of the sublist every iteration.

vector<CLASS_T>::iterator from = vectorOfInterest.begin();
vector<CLASS_T> subList;

do {
    if(vectorOfInterest.end() - from > blockSize) {    
        subList.assign(from, from + blockSize);
        from += blockSize;    
        OTHERCLASS::doSend(subList);
    }else {            
        subList.assign(from, vectorOfInterest.end());
        subList.resize(blockSize);    
        OTHERCLASS::dosend (subList);    
        vectorOfInterest.clear();
        subList.clear();
    }    
} while ( !vectorOfInterest.empty());

If you do not need a real copy and your original vector remains unchanged and accessible, you could send appropriate iterator ranges boost::iterator_range<std::vector<CLASS_T>::iterator> to the library. For this approach, I assume the other class to take a template argument that behaves like a container.

A more important optimisation might be the excessive copying done when you create remainder.

try this:

int blockSize = 50;
vector <CLASS_T> vectorOfInterest; 
vector <CLASS_T> subList(blockSize);

size_t vectorSize = vectorOfInterest.size();

for(int i = 0; i < vectorSize(); i += blockSize)
{
    vector<CLASS_T>iterator from = vectorOfInterest.begin() + i;
    size_t thisBlockSize = min(i+blockSize, vectorSize);
    vector<CLASS_T>iterator to = vectorOfInterest.begin() + thisBlockSize;

    //replace this with an elementwise swap if you can
    std::copy(from, to, subList.begin());
    std::fill(to, vectorOfInterest.end(), /*what ever you want*/);

    OTHERCLASS::doSend (subList);
}

Edit: I just saw the C++98 part. Moving is out of question. In otherwords, you are allocating a new vector twice in your loop, and copying each element twice. By prealloating a vector above the loop you are only allocating one vector (the one that doSend recieves). To avoid copying elements twice is harder. Since you can invalidate the vector's elements, swapping and then copying is possible.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM