Storing indices in a vector<set<int>> vs a vector<vector<int>>

Question

I'm working on a LeetCode problem called "Number of Matching Subsequences". You are given a string S and a vector of shorter strings, and you have to count how many of the shorter strings are subsequences of S (that is, their characters appear in S in order, but not necessarily contiguously).

My code produces correct results, but it exceeded the time limit on LeetCode. Someone else wrote almost the same code, and theirs didn't time out. I'm wondering what makes theirs faster. Here's mine:

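#include <algorithm>
#include <iostream>
#include <set>
#include <string>
#include <vector>
using namespace std;

class Solution {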
public:
    int numMatchingSubseq(string S, vector<string>& words) {
        int count = 0;
        vector<set<int>> Svec (26); // keep track of the indices where characters were seen in S
        for (int i = 0; i < S.length(); ++i) Svec[S[i] - 'a'].insert(i);

        for (auto & w : words) { // loop over words and characters within words, finding the soonest the next character appears in S
            bool succeeded = true;
            int current_index = -1;
            for (auto & c : w) {
                set<int> & c_set = Svec[c - 'a'];
                auto it = upper_bound(begin(c_set), end(c_set), current_index);
                if (it == end(c_set)) {
                    succeeded = false;
                    break;
                }
                current_index = *it;
            } // loop over chars
            if (succeeded) count++;
        } //loop over words
        return count;
    }
};

int main() {
    string S = "cbaebabacd";
    vector<string> words {"abc", "abbd", "bbbbd"};
    Solution sol;
    cout << sol.numMatchingSubseq(S, words) << endl;
    return 0;
}

Outputs

2
Program ended with exit code: 0

His solution stores the indices in a vector<vector<int>> rather than a vector<set<int>>. I don't see why that would make a big difference.

int numMatchingSubseq (string S, vector<string>& words) {
    vector<vector<int>> alpha (26);
    for (int i = 0; i < S.size (); ++i) alpha[S[i] - 'a'].push_back (i);
    int res = 0;

    for (const auto& word : words) {
        int x = -1;
        bool found = true;

        for (char c : word) {
            auto it = upper_bound (alpha[c - 'a'].begin (), alpha[c - 'a'].end (), x);
            if (it == alpha[c - 'a'].end ()) found = false;
            else x = *it;
        }

        if (found) res++;
    }

    return res;
}

Answer 1

This call is inefficient when c_set is a std::set:

upper_bound(begin(c_set), end(c_set), current_index)

See this note in the std::upper_bound() documentation:

for non-LegacyRandomAccessIterators, the number of iterator increments is linear.
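
To see why that note applies here, it can help to check which iterator category each container exposes. The snippet below is a minimal sketch; the alias and constant names are illustrative and do not come from the question. It checks at compile time that std::set iterators are not random access, while std::vector iterators are.

```cpp
#include <iterator>
#include <set>
#include <type_traits>
#include <vector>

// Illustrative names (not from the question): the category tag each iterator
// type reports to generic algorithms such as std::upper_bound.
using SetCategory =
    std::iterator_traits<std::set<int>::iterator>::iterator_category;
using VecCategory =
    std::iterator_traits<std::vector<int>::iterator>::iterator_category;

// True only when the iterator can jump an arbitrary distance in O(1).
constexpr bool kSetIsRandomAccess =
    std::is_base_of<std::random_access_iterator_tag, SetCategory>::value;
constexpr bool kVecIsRandomAccess =
    std::is_base_of<std::random_access_iterator_tag, VecCategory>::value;

static_assert(!kSetIsRandomAccess, "std::set iterators are only bidirectional");
static_assert(kVecIsRandomAccess, "std::vector iterators are random access");
```

With a vector, the binary search can jump straight to the midpoint of the range. With a set, reaching the midpoint means incrementing an iterator one element at a time, which is what makes each generic lookup linear.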

You should instead use the set's own member function, which runs in logarithmic time:

c_set.upper_bound(current_index)
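
For context, here is a sketch of how the question's function might look with that one change applied. It keeps the vector<set<int>> layout and the original variable names. The const references and the asserts are my additions, and the asserts assume the input contains only lowercase 'a' to 'z', as in the LeetCode problem.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Sketch of the question's algorithm using std::set's member upper_bound.
int numMatchingSubseq(const std::string& S,
                      const std::vector<std::string>& words) {
    // Svec[k] holds, in sorted order, every index where letter 'a' + k occurs in S.
    std::vector<std::set<int>> Svec(26);
    for (int i = 0; i < static_cast<int>(S.size()); ++i) {
        assert(S[i] >= 'a' && S[i] <= 'z');  // assumption: lowercase input only
        Svec[S[i] - 'a'].insert(i);
    }

    int count = 0;
    for (const auto& w : words) {
        bool succeeded = true;
        int current_index = -1;  // index in S of the last matched character
        for (char c : w) {
            assert(c >= 'a' && c <= 'z');  // assumption: lowercase input only
            const std::set<int>& c_set = Svec[c - 'a'];
            // The member function descends the tree: logarithmic in c_set.size().
            auto it = c_set.upper_bound(current_index);
            if (it == c_set.end()) {
                succeeded = false;  // no later occurrence of c in S
                break;
            }
            current_index = *it;
        }
        if (succeeded) ++count;
    }
    return count;
}
```

Called with S = "cbaebabacd" and the words "abc", "abbd", "bbbbd" from the question, this returns 2, the same result as the original code.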


solution1
1 2019-02-10 06:45:00
