
To lexicographically compare the kth word from two strings

I'm trying to write a C++ function to lexicographically compare the kth word from two strings. Here is my function:

#include <cstring>   // strtok
#include <string>
using namespace std;

bool kth_lexo ()
{
    int k = 2 ;
    string str1 = "123 300 60009" ;
    string str2 = "1500 10002" ;

// to store the kth word of first string in ptr1
    char *ptr1 = strtok( (char*)str1.c_str() ," "); 
    for(int i = 1; i<k; i++)
    {
        ptr1 = strtok(NULL," ");
    }

// to store the kth word of second string in ptr2
    char *ptr2 = strtok( (char*)str2.c_str() ," "); 
    for(int i = 1; i<k; i++)
    {
        ptr2 = strtok(NULL," ");
    }

    string st1 = ptr1 ;
    string st2 = ptr2 ;
    return st1 > st2 ;


} 

In this function my lexicographical comparison works fine: the function returns 1 because 300 (the 2nd word of str1) is lexicographically greater than 10002 (the 2nd word of str2).

My problem: suppose I slightly modify the function by replacing its last line with return ptr1 > ptr2; so that the new function looks like this:

bool kth_lexo ()
{
    int k = 2 ;
    string str1 = "123 300 60009" ;
    string str2 = "1500 10002" ;

// to store the kth word of first string in ptr1
    char *ptr1 = strtok( (char*)str1.c_str() ," "); 
    for(int i = 1; i<k; i++)
    {
        ptr1 = strtok(NULL," ");
    }

// to store the kth word of second string in ptr2
    char *ptr2 = strtok( (char*)str2.c_str() ," "); 
    for(int i = 1; i<k; i++)
    {
        ptr2 = strtok(NULL," ");
    }

// modified line compared to previous function
    return ptr1 > ptr2 ;


}

With this modified function, my output is consistently 0, no matter whether the kth word of str1 (stored in ptr1) is lexicographically greater or smaller than the kth word of str2 (stored in ptr2).

Changing the return statement to this line doesn't help much either: return (*ptr1) > (*ptr2);

So what is the problem with either of these two return statements in my modified function for comparing the kth word of both strings:

return ptr1 > ptr2;

or

return (*ptr1) > (*ptr2);

Your program is written in a very C-like style. Using modern C++ makes this much simpler and easier to read, since we can use very expressive syntax:

#include <string_view>
#include <iostream>
#include <cassert>

auto find_kth_char(std::string_view to_search, char c, std::size_t k, std::size_t pos = 0) {
    for (; pos < std::string_view::npos && k > 0; --k) {
        pos = to_search.find(c, pos + 1);
    }
    return pos;
}

auto get_kth_word(std::string_view to_search, std::size_t k) {
    // We count starting on 1
    assert(k > 0);
    auto start = find_kth_char(to_search, ' ', k - 1);
    if (start == std::string_view::npos) {
        return std::string_view{};
    }
    // For k > 1, start points at the space before the word; step past it
    if (k > 1) {
        ++start;
    }

    auto end = find_kth_char(to_search, ' ', 1, start);

    return to_search.substr(start, end - start);
}

auto compare_kth(std::string_view lhs, std::string_view rhs, std::size_t k) {
    auto l_word = get_kth_word(lhs, k);
    auto r_word = get_kth_word(rhs, k);

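    // The sign of the result mirrors the ordering of the two words:
    // negative if l_word < r_word, 0 if equal, positive if l_word > r_word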
    return l_word.compare(r_word);
}

int main() {
    auto str1 = "123 300 60009";
    auto str2 = "1500 10002";

    for (std::size_t k = 1; k < 4; ++k) {
        std::cout << k << ":\t" << compare_kth(str1, str2, k) << '\n';
    }
}

I am using C++17's string_view since we do not change anything in the strings, and taking substrings etc. is very cheap with it. We use the find and compare member functions to do the real work.

The return value of our function is an int that tells us whether the left-hand side is smaller than (negative result), equal to (0), or greater than (positive result) the right-hand side.
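To make that sign convention concrete, here is a minimal sketch that turns the int from compare into a label. The helper name describe_order is hypothetical and not part of the answer's code; only the sign of the result is relied on, since the magnitude is not something to depend on.

```cpp
#include <string_view>

// Hypothetical helper (not from the answer above): maps the int returned by
// std::string_view::compare to a readable label. Only the sign is inspected.
inline std::string_view describe_order(std::string_view lhs, std::string_view rhs) {
    const int result = lhs.compare(rhs);
    if (result < 0) {
        return "less";
    }
    if (result > 0) {
        return "greater";
    }
    return "equal";
}
```

Called with the second words from the question, describe_order("300", "10002") yields "greater", because '3' sorts after '1'.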

If you would stop using C and consistently use C++, this problem would not occur.

Here you are mixing up C++ std::string and char* or const char*. For strings, std::string is so far superior to old-style C char arrays or char* that, from now on, you should not use anything other than std::string.

A char pointer is an address of some area in memory where your char data is stored. Dereferencing the pointer with * gives you the element stored at that address. So only one element. Not a string or anything else. Exactly one character.

Comparing ptr1 > ptr2 will not compare strings. It compares the addresses at which the strings are stored in memory. ptr1 could be 0x578962574 and ptr2 could be 0x95324782, or whatever. We do not know the addresses. They depend on where each std::string happens to keep its character buffer at run time, and the result of comparing pointers into two unrelated objects with > is unspecified in any case.
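If the words really do have to stay as char pointers, the contents can be compared with std::strcmp from the C library instead of comparing the pointer values. A minimal sketch, where the helper name c_str_greater is an assumption made for illustration:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when the C string lhs sorts after rhs.
// std::strcmp walks both strings character by character, so the addresses
// held in the pointers play no role in the result.
inline bool c_str_greater(const char* lhs, const char* rhs) {
    assert(lhs != nullptr && rhs != nullptr);  // strtok returns NULL when no token is left
    return std::strcmp(lhs, rhs) > 0;
}
```

In the question's function this would correspond to return c_str_greater(ptr1, ptr2); although converting to std::string, as the first version did, is the simpler route.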

And if you compare (*ptr1) > (*ptr2), you compare only 2 single characters, namely the first character of each word, and that may also give you the wrong result.
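A small sketch of where the single-character comparison goes wrong. Both function names are hypothetical; the first mirrors the dereferencing return statement from the question, the second compares whole words through std::string.

```cpp
#include <cassert>
#include <string>

// Mirrors `return (*ptr1) > (*ptr2);` and looks at the first character only.
inline bool first_char_greater(const char* lhs, const char* rhs) {
    assert(lhs != nullptr && rhs != nullptr);
    return *lhs > *rhs;
}

// Compares the complete words with std::string's operator>.
inline bool whole_word_greater(const std::string& lhs, const std::string& rhs) {
    return lhs > rhs;
}
```

For the words "19" and "100" the two disagree: both start with '1', so first_char_greater returns false, while whole_word_greater returns true because the second characters, '9' and '0', decide the ordering.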

On the other hand, comparing 2 std::strings will always work as expected.

So, the simple answer: use std::string for all strings.
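As one possible illustration of that advice, the question's function could be sketched without strtok or any char pointers by letting std::istringstream split the words. This is only a sketch: the helper name kth_word and the choice to pass the strings and k as parameters are assumptions made here, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns the kth whitespace-separated word of text
// (counting from 1), or an empty string if text has fewer than k words.
inline std::string kth_word(const std::string& text, std::size_t k) {
    assert(k > 0);
    std::istringstream stream(text);
    std::string word;
    for (std::size_t i = 0; i < k; ++i) {
        if (!(stream >> word)) {
            return {};  // ran out of words before reaching the kth one
        }
    }
    return word;
}

// True when the kth word of lhs is lexicographically greater than that of rhs.
inline bool kth_lexo(const std::string& lhs, const std::string& rhs, std::size_t k) {
    return kth_word(lhs, k) > kth_word(rhs, k);
}
```

With the strings from the question, kth_lexo("123 300 60009", "1500 10002", 2) compares "300" with "10002" and returns true. Unlike the strtok version, the input strings are left unmodified.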
