Case-insensitive std::string.find()

Question

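I am using std::string's find() method to test whether one string is a substring of another. Now I need a case-insensitive version of the same thing. For string comparison I can always fall back on stricmp(), but there doesn't seem to be a stristr().

I have found various answers, and most of them suggest using Boost, which is not an option in my case. Additionally, I need to support std::wstring / wchar_t. Any ideas?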

Answer 1

You can use std::search with a custom predicate.

#include <locale>
#include <iostream>
#include <algorithm>
using namespace std;

// templated version of my_equal so it could work with both char and wchar_t
template<typename charT>
struct my_equal {
    my_equal( const std::locale& loc ) : loc_(loc) {}
    // const so the predicate can be invoked through a const copy
    bool operator()(charT ch1, charT ch2) const {
        return std::toupper(ch1, loc_) == std::toupper(ch2, loc_);
    }
private:
    const std::locale& loc_;
};

// find substring (case insensitive)
template<typename T>
int ci_find_substr( const T& str1, const T& str2, const std::locale& loc = std::locale() )
{
    typename T::const_iterator it = std::search( str1.begin(), str1.end(), 
        str2.begin(), str2.end(), my_equal<typename T::value_type>(loc) );
    if ( it != str1.end() ) return it - str1.begin();
    else return -1; // not found
}

int main(int argc, char *argv[])
{
    // string test
    std::string str1 = "FIRST HELLO";
    std::string str2 = "hello";
    int f1 = ci_find_substr( str1, str2 );

    // wstring test
    std::wstring wstr1 = L"ОПЯТЬ ПРИВЕТ";
    std::wstring wstr2 = L"привет";
    int f2 = ci_find_substr( wstr1, wstr2 );

    return 0;
}

Answer 2

The new C++11 style:

#include <algorithm>
#include <string>
#include <cctype>

/// Try to find in the Haystack the Needle - ignore case
bool findStringIC(const std::string & strHaystack, const std::string & strNeedle)
{
  auto it = std::search(
    strHaystack.begin(), strHaystack.end(),
    strNeedle.begin(),   strNeedle.end(),
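    // take unsigned char: passing a negative char to std::toupper is undefined behaviour
    [](unsigned char ch1, unsigned char ch2) { return std::toupper(ch1) == std::toupper(ch2); }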
  );
  return (it != strHaystack.end() );
}

An explanation of std::search can be found on cplusplus.com.

Answer 3

Why not use Boost.StringAlgo:

#include <boost/algorithm/string/find.hpp>

bool Foo()
{
   //case insensitive find

   std::string str("Hello");

   boost::iterator_range<std::string::const_iterator> rng;

   rng = boost::ifind_first(str, std::string("EL"));

   return !rng.empty(); // an empty range means "not found"
}

Answer 4

Why not convert both strings to lowercase before calling find()?

tolower

Note:

Inefficient for long strings.
Beware of internationalization issues.
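
A minimal sketch of this idea, assuming plain std::string and the single-byte std::tolower from <cctype>. The helper names lower_copy and find_ci_lower are made up for illustration. Both strings are copied on every call, which is where the inefficiency noted above comes from.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lower-cased copy of s.
// The lambda takes unsigned char so std::tolower never sees a negative value.
std::string lower_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: position of needle in haystack ignoring case,
// or std::string::npos if it does not occur.
std::string::size_type find_ci_lower(const std::string& haystack, const std::string& needle)
{
    return lower_copy(haystack).find(lower_copy(needle));
}
```

Because the lowered copies have the same length as the originals for single-byte characters, the returned index can be used directly on the original string.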

Answer 5

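Since you are doing a substring search (std::string) rather than an element (character) search, there is unfortunately no existing solution that I know of in the standard library that does this out of the box.

Still, it is easy enough to do: simply convert both strings to upper case (or both to lower case; I chose upper in this example).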

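#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Returns an upper-cased copy of str.
std::string upper_string(const std::string& str)
{
    std::string upper;
    // Take unsigned char so std::toupper never receives a negative value.
    std::transform(str.begin(), str.end(), std::back_inserter(upper),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return upper;
}

// Returns the position of substr in str ignoring case, or std::string::npos.
std::string::size_type find_str_ci(const std::string& str, const std::string& substr)
{
    return upper_string(str).find(upper_string(substr));
}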

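This is not a fast solution (it borders on pessimization territory), but it is the only off-the-shelf one I know of. If you are worried about efficiency, it is also not hard to implement your own case-insensitive substring finder.

"Additionally, I need to support std::wstring/wchar_t. Any ideas?"

The tolower/toupper overloads in <locale> work on wide strings as well, so the solution above should be just as applicable (simply change std::string to std::wstring).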
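
As a rough sketch of that wide-string variant, the upper-casing approach can be written as a template over the character type using the two-argument std::toupper from <locale>. The names upper_copy and find_str_ci_t are invented here, and the default locale argument is an assumption; non-ASCII letters only fold correctly if a suitable locale is passed in.

```cpp
#include <algorithm>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: upper-cased copy of s using the ctype facet of loc.
template <typename CharT>
std::basic_string<CharT> upper_copy(const std::basic_string<CharT>& s,
                                    const std::locale& loc = std::locale())
{
    std::basic_string<CharT> out;
    out.reserve(s.size());
    std::transform(s.begin(), s.end(), std::back_inserter(out),
                   [&loc](CharT c) { return std::toupper(c, loc); });
    return out;
}

// Hypothetical helper: works for std::string and std::wstring alike.
// Returns the position of substr in str ignoring case, or npos.
template <typename CharT>
typename std::basic_string<CharT>::size_type
find_str_ci_t(const std::basic_string<CharT>& str,
              const std::basic_string<CharT>& substr,
              const std::locale& loc = std::locale())
{
    return upper_copy(str, loc).find(upper_copy(substr, loc));
}
```

Both arguments must have the same string type so that CharT can be deduced, for example find_str_ci_t(std::wstring(L"The WSTRING"), std::wstring(L"wstr")).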

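[Edit] As has been pointed out, an alternative is to adapt your own case-insensitive string type from basic_string by specifying your own character traits. This works if you can accept all string searches, comparisons, etc. being case-insensitive for that string type.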
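
A sketch of what such a traits class might look like for char, assuming ASCII-style folding via std::toupper. The names ci_char_traits and ci_string are illustrative and do not come from the standard library.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: compares characters after upper-casing them.
struct ci_char_traits : public std::char_traits<char>
{
    static char up(char c)
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return up(a) == up(b); }
    static bool lt(char a, char b) { return up(a) < up(b); }

    // Three-way comparison of the first n characters, ignoring case.
    static int compare(const char* s1, const char* s2, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }

    // First position in [s, s + n) that matches a ignoring case, or nullptr.
    static const char* find(const char* s, std::size_t n, char a)
    {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], a)) return s + i;
        return nullptr;
    }
};

// Every search and comparison on this type is case-insensitive.
using ci_string = std::basic_string<char, ci_char_traits>;
```

With this in place, ci_string("FIRST HELLO").find("hello") searches without regard to case. The trade-off is that ci_string is a distinct type from std::string, so converting between the two means copying the characters.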

Answer 6

It also makes sense to provide the Boost version. Note that this modifies the original strings.

#include <boost/algorithm/string.hpp>
#include <string>

std::string str1 = "hello world!!!";
std::string str2 = "HELLO";
boost::algorithm::to_lower(str1);
boost::algorithm::to_lower(str2);

if (str1.find(str2) != std::string::npos)
{
    // str1 contains str2
}

Or use the excellent Boost.Xpressive library:

#include <boost/xpressive/xpressive.hpp>
using namespace boost::xpressive;
....
std::string long_string( "very LonG string" );
std::string word("long");
smatch what;
sregex re = sregex::compile(word, boost::xpressive::icase);
if( regex_search( long_string, what, re ) ) // regex_match would require the whole string to match
{
    cout << word << " found!" << endl;
}

In this example, take care that your search word does not contain any regular expression special characters.
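
One way to guard against that is to escape the search word before compiling it. The sketch below swaps in the standard library's std::regex (C++11, default ECMAScript grammar) instead of Boost.Xpressive so that it has no external dependencies. The helpers escape_regex and contains_ci_regex are hypothetical names, and the list of metacharacters is an assumption based on the ECMAScript syntax.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: prefixes every ECMAScript regex metacharacter
// in text with a backslash so it is matched literally.
std::string escape_regex(const std::string& text)
{
    static const std::string special = R"(\^$.|?*+()[]{})";
    std::string out;
    out.reserve(text.size() * 2);
    for (char c : text) {
        if (special.find(c) != std::string::npos)
            out += '\\';
        out += c;
    }
    return out;
}

// Hypothetical helper: true if word occurs anywhere in haystack, ignoring case.
bool contains_ci_regex(const std::string& haystack, const std::string& word)
{
    const std::regex re(escape_regex(word), std::regex::icase);
    return std::regex_search(haystack, re);
}
```

Without the escaping step, a word such as "3.50" would also match "3x50", because an unescaped dot matches any character.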

Answer 7

If you want a "real" comparison according to Unicode and locale rules, use ICU's Collator class.

Answer 8

#include <iostream>
using namespace std;

template <typename charT>
struct ichar {
    operator charT() const { return toupper(x); }
    charT x;
};
template <typename charT>
static basic_string<ichar<charT> > *istring(basic_string<charT> &s) { return (basic_string<ichar<charT> > *)&s; }
template <typename charT>
static ichar<charT> *istring(const charT *s) { return (ichar<charT> *)s; }

int main()
{
    string s = "The STRING";
    wstring ws = L"The WSTRING";
    cout << istring(s)->find(istring("str")) << " " << istring(ws)->find(istring(L"wstr"))  << endl;
}

A bit dirty, but short and fast.

Answer 9

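I like the answers from Kiril V. Lyadvinsky and CC, but my problem was a bit more specific than plain case-insensitivity. I needed a lazy, Unicode-capable command-line argument parser that could eliminate false positives and false negatives in alphanumeric string searches where the base string may contain special characters used to format the alphanumeric keyword I am searching for. For example, Wolfjäger shouldn't match jäger, but <jäger> should.

It is basically just Kiril's/CC's answer with extra handling for exact-length alphanumeric matches.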

/* Undefined behavior when a non-alpha-num substring parameter is used. */
bool find_alphanum_string_CI(const std::wstring& baseString, const std::wstring& subString)
{
    /* Fail fast if the base string was smaller than what we're looking for */
    if (subString.length() > baseString.length()) 
        return false;

    auto it = std::search(
        baseString.begin(), baseString.end(), subString.begin(), subString.end(),
        // wchar_t, not char: the strings are std::wstring (std::towupper needs <cwctype>)
        [](wchar_t ch1, wchar_t ch2)
        {
            return std::towupper(ch1) == std::towupper(ch2);
        }
    );

    if(it == baseString.end())
        return false;

    size_t match_start_offset = it - baseString.begin();

    std::wstring match_start = baseString.substr(match_start_offset, std::wstring::npos);

    /* Typical special characters and whitespace to split the substring up. */
    size_t match_end_pos = match_start.find_first_of(L" ,<.>;:/?\'\"[{]}=+-_)(*&^%$#@!~`");

    /* Pass fast if the remainder of the base string where
       the match started is the same length as the substring. */
    if (match_end_pos == std::wstring::npos && match_start.length() == subString.length()) 
        return true;

    std::wstring extracted_match = match_start.substr(0, match_end_pos);

    return (extracted_match.length() == subString.length());
}

Answer 10

The most efficient way

Simple and fast.

Performance is guaranteed to be linear, with an initialization cost of 2 * NEEDLE_LEN comparisons. (glibc)

#include <cstring>
#include <string>
#include <iostream>

int main(void) {

    std::string s1{"abc de fGH"};
    std::string s2{"DE"};

    auto pos = strcasestr(s1.c_str(), s2.c_str());

    if(pos != nullptr)
        std::cout << pos - s1.c_str() << std::endl;

    return 0;
}

Answer 11

wxWidgets has a very rich string API, wxString.

It can be done like this (using the case-conversion approach):

int Contains(const wxString& SpecProgramName, const wxString& str)
{
  wxString SpecProgramName_ = SpecProgramName.Upper();
  wxString str_ = str.Upper();
  int found = SpecProgramName_.Find(str_); // search the upper-cased copy, not the original
  if (wxNOT_FOUND == found)
  {
    return 0;
  }
  return 1;
}

Answer 1
81, accepted, 2010-06-30 18:35:33

Answer 2
62, 2013-11-07 15:08:04

Answer 3
19, 2014-11-04 11:22:06

Answer 4
17, 2010-06-30 18:34:18

Answer 5
8, 2010-06-30 18:41:51

Answer 6
2, 2013-12-31 12:15:09

Answer 7
2, 2010-06-30 18:58:30

Answer 8
0, 2015-08-06 10:49:56

Answer 9
0, 2018-06-27 23:34:38

Answer 10
0, 2022-12-05 14:57:21

Answer 11
-2, 2019-12-24 07:06:44
