
C++: set of C-strings

I want to create a set of C-strings so that I can check whether a certain word is in the set using set::find.

However, C-strings are pointers, so by default the set would compare them by pointer value. To work correctly, it would have to dereference them and compare the strings.

I could just pass the constructor a pointer to the strcmp() function as a comparator, but this is not exactly how I want it to work. The word I might want to check could be part of a longer string, and I don't want to create a new string due to performance concerns. If it weren't for the set, I would use strncmp(a1, a2, 3) to check the first 3 letters. In fact, 3 is probably the longest it could go, so I'm fine with having the third argument constant.

Is there a way to construct a set that would compare its elements by calling strncmp()? Code samples would be greatly appreciated.

Here's pseudocode for what I want to do:

bool WordInSet (string, set, length)
{
   for (each word in set)
    {
       if strncmp(string, word, length) == 0
            return true;
    }
    return false;
}

But I'd prefer to implement it using the standard library functions.

You could create a comparator function object.

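#include <cstring>

struct set_object {
    // A std::set comparator must return true only when first orders before second.
    bool operator()(const char* first, const char* second) const {
        return std::strncmp(first, second, 3) < 0;
    }
};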

std::set<const char*, set_object> c_string_set;

However, it would be far easier and more reliable to make a set of std::string objects.
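As a rough illustration of the std::string suggestion, the sketch below stores the words as std::string and looks up a fixed-length prefix of a longer C-string. The helper name word_in_set and its length parameter are assumptions made for this example; they do not come from the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: true if the first `length` characters of `text`
// form a word stored in `words`.
// Precondition (assumed): `text` points to at least `length` characters.
bool word_in_set(const std::set<std::string>& words,
                 const char* text, std::size_t length) {
    assert(text != nullptr);
    // Build a temporary key from the prefix; the set compares keys by content.
    return words.find(std::string(text, length)) != words.end();
}
```

The temporary std::string is exactly the copy the question hoped to avoid. For keys this short, common standard library implementations typically keep the characters inside the string object rather than on the heap, but that is an implementation detail and not a guarantee.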

Make a wrapper function:

#include <cstring>

// Orders two C-strings by their first 3 characters only.
bool myCompare(const char * lhs, const char * rhs)
{
    return std::strncmp(lhs, rhs, 3) < 0;
}
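One way the wrapper might be plugged into a set is sketched below. Because the comparator type is a function pointer, the pointer itself has to be passed to the set's constructor. The names PrefixSet, make_prefix_set and starts_with_known_word are hypothetical, and the sample words are made up for the illustration.

```cpp
#include <cassert>
#include <cstring>
#include <set>

// Same wrapper as above, repeated so the sketch is self-contained.
bool myCompare(const char* lhs, const char* rhs) {
    return std::strncmp(lhs, rhs, 3) < 0;
}

// Hypothetical alias: a set of C-strings ordered by a comparison function pointer.
using PrefixSet = std::set<const char*, bool (*)(const char*, const char*)>;

// Builds a small sample set; string literals live for the whole program,
// so storing their addresses is safe here.
PrefixSet make_prefix_set() {
    PrefixSet words(myCompare);
    words.insert("cat");
    words.insert("dog");
    return words;
}

// True if the first 3 characters at `text` match a stored word.
bool starts_with_known_word(const PrefixSet& words, const char* text) {
    assert(text != nullptr);
    return words.find(text) != words.end();
}
```

With this ordering, a pointer into the middle of a longer string can be handed to find directly, so no copy is made. Note two consequences of the fixed length: "cathedral" is treated as equivalent to "cat", and a stored two-letter word such as "to" is not found in "to be", because the third character compared is the terminating NUL on one side and a space on the other.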

Assuming a constant word length looks like asking for trouble to me. I recommend against that solution.

Look: the strcmp solution doesn't work for you because it treats the const char* arguments as NUL-terminated strings. You want a function that does exactly the same but treats the arguments as words, that is, as strings terminated by anything that is not a letter.

One could define strcmp in a generic way as:

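template<typename EndPredicate>
int generic_strcmp(const char* s1, const char* s2) {
    EndPredicate is_end;  // assumes a default-constructible function object type
    unsigned char c1;
    unsigned char c2;
    do {
        c1 = static_cast<unsigned char>(*s1++);
        c2 = static_cast<unsigned char>(*s2++);
        // Map every terminator to '\0' so that differing terminators compare equal.
        if (is_end(c1)) c1 = '\0';
        if (is_end(c2)) c2 = '\0';
        if (c1 == '\0') {
            return c1 - c2;
        }
    } while (c1 == c2);

    return c1 - c2;
}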

If EndPredicate is a function object that returns true iff its argument is equal to '\0', then we obtain a regular strcmp, which compares NUL-terminated strings.
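To make the role of the predicate concrete, here is a hedged variant that receives the predicate as a function argument instead of a template type, so plain functions and lambdas can be passed. The names generic_strcmp_with, is_nul and is_not_letter are assumptions for this sketch. It also maps every terminator to NUL on both sides, which is a choice made here so that two words followed by different delimiters compare equal.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical variant: the end-of-string test is passed in as `is_end`.
template <typename EndPredicate>
int generic_strcmp_with(const char* s1, const char* s2, EndPredicate is_end) {
    assert(s1 != nullptr && s2 != nullptr);
    unsigned char c1;
    unsigned char c2;
    do {
        c1 = static_cast<unsigned char>(*s1++);
        c2 = static_cast<unsigned char>(*s2++);
        // Normalise every terminator to NUL before comparing.
        if (is_end(c1)) c1 = '\0';
        if (is_end(c2)) c2 = '\0';
        // A real NUL always stops the scan, whatever the predicate says.
        if (c1 == '\0') {
            return c1 - c2;
        }
    } while (c1 == c2);
    return c1 - c2;
}

// Ends only at NUL: behaves like a regular strcmp.
bool is_nul(unsigned char c) { return c == '\0'; }

// Ends at the first character that is not a letter: compares words.
bool is_not_letter(unsigned char c) { return std::isalpha(c) == 0; }
```

With is_nul the strings "cat, dog" and "cat" differ, while with is_not_letter they compare equal, because the comma ends the first word.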

But in order to have a function which compares words, the only required change is the predicate. It's sufficient to use the negated isalpha function from the <cctype> header to indicate that the string ends when a non-alphabetic character is encountered.

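So in your case, the comparison function for the set would look like this. Since it returns a three-way result like strcmp, the set's comparator should return true when the result is negative: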

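#include <cctype>

// Three-way comparison of two words; a word ends at the first non-letter.
int wordcmp(const char* s1, const char* s2) {
    unsigned char c1;
    unsigned char c2;
    do {
        c1 = static_cast<unsigned char>(*s1++);
        c2 = static_cast<unsigned char>(*s2++);
        // Map every non-letter to '\0' so that "cat," and "cat" compare equal.
        if (!std::isalpha(c1)) c1 = '\0';
        if (!std::isalpha(c2)) c2 = '\0';
        if (c1 == '\0') {
            return c1 - c2;
        }
    } while (c1 == c2);

    return c1 - c2;
}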
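A possible way to use such a word comparison with std::set is sketched below. The block repeats a comparison in the spirit of wordcmp, in which any non-letter ends a word on either side, so that it is self-contained. The adapter WordLess, the alias WordSet and the helper word_in_set are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cctype>
#include <set>

// Three-way word comparison; any non-letter terminates a word.
int wordcmp(const char* s1, const char* s2) {
    unsigned char c1;
    unsigned char c2;
    do {
        c1 = static_cast<unsigned char>(*s1++);
        c2 = static_cast<unsigned char>(*s2++);
        if (!std::isalpha(c1)) c1 = '\0';
        if (!std::isalpha(c2)) c2 = '\0';
        if (c1 == '\0') {
            return c1 - c2;
        }
    } while (c1 == c2);
    return c1 - c2;
}

// Hypothetical adapter: std::set needs a "less than" test, not a three-way result.
struct WordLess {
    bool operator()(const char* lhs, const char* rhs) const {
        return wordcmp(lhs, rhs) < 0;
    }
};

using WordSet = std::set<const char*, WordLess>;

// True if the word starting at `text` is stored in `words`.
bool word_in_set(const WordSet& words, const char* text) {
    assert(text != nullptr);
    return words.find(text) != words.end();
}
```

A pointer into the middle of a sentence can be passed straight to word_in_set, so no substring is copied, and words of any length are handled without a fixed character count.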
