How to check if a string is a proper subset of another string

I want to check whether a string is strictly a subset of another string. To this end I used boost::contains and compared the sizes of the strings as follows:

#include <boost/algorithm/string.hpp>
#include <iostream>

using namespace std;
using namespace boost::algorithm;

int main()
{
  string str1 = "abc news";
  string str2 = "abc";
  // trim both strings using boost
  trim(str1);
  trim(str2);
  // if str2 is contained in str1 and is shorter than str1, it is strictly contained in str1
  if(contains(str1,str2) && (str2.size() < str1.size()))
  {
    cout <<"contains" << endl;
  }
  return 0;
}

Is there a better way to solve this problem, without also comparing the sizes of the strings?


Example

  • ABC is a proper subset of ABC NEWS
  • ABC is not a proper subset of ABC

I would use the following:

bool is_substr_of(const std::string& sub, const std::string& s) {
  return sub.size() < s.size() && s.find(sub) != s.npos;
}

This uses only the standard library, and it does the size check first, which is cheaper than s.find(sub) != s.npos.
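As a quick sanity check, the function can be run against the examples from the question. The sketch below repeats the definition so that it compiles on its own; the calls listed in the comments are illustrative.

```cpp
#include <string>

// Same definition as above: `sub` must be strictly shorter than `s`
// and must occur somewhere inside it.
bool is_substr_of(const std::string& sub, const std::string& s) {
  return sub.size() < s.size() && s.find(sub) != s.npos;
}

// is_substr_of("ABC", "ABC NEWS")  is true  (contained and shorter)
// is_substr_of("ABC", "ABC")       is false (equal strings)
// is_substr_of("ABC NEWS", "ABC")  is false (longer than the target)
```

Note that with this definition the empty string counts as a proper subset of every non-empty string, since find("") succeeds at position 0. Whether that is desirable depends on your definition of "strictly".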

You can just use == or != to compare the strings:

if(contains(str1, str2) && (str1 != str2))
    ...

If one string contains the other and the two are not equal, you have a proper subset.

Whether this is better than your method is for you to decide. It is less typing and very clear (IMO), but probably a little slower if both strings are long and equal, or if both start with the same long sequence.

Note: If you really care about performance, you might want to try the Boyer-Moore search and the Boyer-Moore-Horspool search. They can be much faster than a trivial string search (as is apparently used for string search in libstdc++); I do not know whether boost::contains uses them.
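If a C++17 compiler is available, one way to try Boyer-Moore without extra dependencies is std::boyer_moore_searcher from <functional>, passed to std::search. The following is only a sketch under that assumption; the helper name is_proper_substr_bm is made up for illustration, and whether it beats std::string::find on your data is something you would have to measure.

```cpp
#include <algorithm>   // std::search
#include <functional>  // std::boyer_moore_searcher (C++17)
#include <string>

// Hypothetical helper: true if `sub` is strictly shorter than `s`
// and occurs somewhere inside it, searched with Boyer-Moore.
bool is_proper_substr_bm(const std::string& sub, const std::string& s) {
  // Cheap size check first, so equal or longer strings never reach the search.
  if (sub.size() >= s.size()) return false;

  // The searcher preprocesses the pattern once; it can be reused
  // if the same pattern is searched for in many texts.
  std::boyer_moore_searcher searcher(sub.begin(), sub.end());
  return std::search(s.begin(), s.end(), searcher) != s.end();
}
```

std::boyer_moore_horspool_searcher can be swapped in the same way. The preprocessing cost usually only pays off for longer patterns or repeated searches.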

About comparison operations

TL;DR: Be sure about the format of what you're comparing.

Be wary of how you define strictly.

For example, you did not point out these issues in your question, but suppose I submit, say:

 "ABC       " // i.e. trailing whitespace
 "ABC\n"

What is your take on it? Do you accept it or not? If you don't, you'll have to either trim or clean your strings before comparing. This is just a general note on comparison operations.
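If you decide that surrounding whitespace should not count, the strings can be normalized before the check. Here is a minimal sketch using only the standard library; the helper name trim_copy_ws and the set of characters treated as whitespace are assumptions, so adjust them to your own definition.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `s` without leading and
// trailing whitespace (space, tab, newline, carriage return, form feed,
// vertical tab).
std::string trim_copy_ws(const std::string& s) {
  const char* ws = " \t\n\r\f\v";
  const auto first = s.find_first_not_of(ws);
  if (first == std::string::npos) return "";  // empty or whitespace only
  const auto last = s.find_last_not_of(ws);
  return s.substr(first, last - first + 1);
}
```

With this, both "ABC       " and "ABC\n" normalize to "ABC", so neither would be reported as strictly containing "ABC".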

Anyway, as Baum pointed out, you can either check the equality of your strings using ==, or you can compare their lengths with either size() or length() (which is more efficient, given that you have already checked for a substring).

Another approach, using only the standard library:

#include <algorithm>
#include <string>
#include <iostream>

using namespace std;

int main()
{
  string str1 = "abc news";
  string str2 = "abc";
  if (str2 != str1
    && search(begin(str1), end(str1), 
              begin(str2), end(str2)) != end(str1))
  {
    cout <<"contains" << endl;
  }
  return 0;
}

