Matching of strings with special characters

I need to generate a string that matches another string, where both contain special characters. I wrote what I thought would be a simple method, but so far nothing has given me a successful match.

I know that special characters in C++ are preceded by a "\". For example, a single quote would be written as "\'".

string json_string(const string& incoming_str)
{
    string str = "\\\"" + incoming_str + "\\\"";
    return str;
}

And this is the string I have to compare to:

bool comp = json_string("hello world") == "\"hello world\"";

I can see in the cout stream that in fact I'm generating the string as needed but the comparison still gives a false value.

What am I missing? Any help would be appreciated.

One way is to filter one string and compare this filtered string. For example:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>

// Returns a copy of `unfiltered` without any character found in `specialChars`.
std::string filterBy(const std::string& unfiltered, const std::string& specialChars)
{
    std::string filtered;

    // find() returns std::string::npos when the character is not present.
    std::copy_if(unfiltered.begin(), unfiltered.end(),
                 std::back_inserter(filtered),
                 [&specialChars](char c) { return specialChars.find(c) == std::string::npos; });

    return filtered;
}

int main() {
    std::string specialChars = "\"";
    std::string string1 = "test";
    std::string string2 = "\"test\"";

    std::cout << (string1 == filterBy(string2, specialChars) ? "match" : "no match") << '\n';

    return 0;
}

The output is "match". This code also works if you add an arbitrary number of characters to specialChars.

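If both strings contain special characters, you can also put string1 through the filterBy function. Then, after filtering both sides, a pair such as:

"\"hello \"world \"" and "\"hello world "

will also match, because both reduce to the same characters once the quotes are removed.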

If the comparison is performance-critical, you might also write a comparison that walks both strings with two iterators and skips the special characters as it goes. This avoids allocating filtered copies and runs in O(N+M) time, where N and M are the sizes of the two strings, respectively.
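The two-iterator idea could be sketched as follows. This is only an illustration, not code from the answer: the name equalIgnoring and its signature are assumptions, and the lookup in specialChars adds a factor proportional to the number of special characters.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the original answer): compares two strings
// while ignoring every character that appears in `specialChars`.
bool equalIgnoring(const std::string& a, const std::string& b,
                   const std::string& specialChars)
{
    auto isSpecial = [&specialChars](char c) {
        return specialChars.find(c) != std::string::npos;
    };

    auto i = a.begin();
    auto j = b.begin();
    while (true) {
        // Advance each iterator past any special characters.
        while (i != a.end() && isSpecial(*i)) ++i;
        while (j != b.end() && isSpecial(*j)) ++j;

        // Once either string is exhausted, they match only if both are.
        if (i == a.end() || j == b.end())
            return i == a.end() && j == b.end();

        if (*i != *j)
            return false;
        ++i;
        ++j;
    }
}

// Small self-check; call it from main() to try the helper out.
void checkEqualIgnoring()
{
    assert(equalIgnoring("test", "\"test\"", "\""));
    assert(!equalIgnoring("test", "\"text\"", "\""));
}
```

Each character of either string is visited at most once, so no temporary strings are created.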

bool comp = json_string("hello world") == "\"hello world\"";

This will definitely yield false. json_string("hello world") creates the string \"hello world\" (including the two backslashes), but you are comparing it to "hello world" (quotes only, no backslashes).

The problem is here:

 string str = "\\\"" + incoming_str + "\\\"";

In the first string literal of str, the leading \\ is not an extra level of escaping for the quote. It is itself an escape sequence that produces a literal backslash in the string, and the \" that follows then produces the quote. The last string literal has the same problem.

Do this instead:

string str = "\"" + incoming_str + "\"";
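A quick way to see the difference is to compare the lengths of the two results. The sketch below is illustrative only; the names json_string_buggy, json_string_fixed and compare_versions are made up for this example.

```cpp
#include <cassert>
#include <string>

// The version from the question: "\\\"" is a backslash followed by a quote.
std::string json_string_buggy(const std::string& incoming_str)
{
    return "\\\"" + incoming_str + "\\\"";
}

// The corrected version: "\"" is a single double-quote character.
std::string json_string_fixed(const std::string& incoming_str)
{
    return "\"" + incoming_str + "\"";
}

// Self-check showing where the two versions differ.
void compare_versions()
{
    // "hello world" has 11 characters.
    assert(json_string_buggy("hello world").size() == 15); // 11 + 2 * 2
    assert(json_string_fixed("hello world").size() == 13); // 11 + 2 * 1
    assert(json_string_fixed("hello world") == "\"hello world\"");
    assert(json_string_buggy("hello world") != "\"hello world\"");
}
```

Printing both strings with cout can hide the problem at a glance, whereas the lengths make the two extra backslashes obvious.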

In C++ string literals are delimited by quotes.

Then the problem arises: How can I define a string literal that itself contains quotes? In Python (for comparison), this is easy (though the approach has other drawbacks that are not of interest here): 'a string with " (quote)'.

C++ doesn't have this alternative string representation¹. Instead, you are limited to using escape sequences (which are available in Python, too, just for completeness): Within a string (or character) literal (but nowhere else!), the sequence \" will be replaced by a single double-quote character in the resulting string.

So "\"hello world\"" defined as a character array would be:

{ '"', 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '"', 0 };

Note that inside these character literals the escape character is no longer necessary for the double quote.

Within your json_string function, you append additional backslashes, though:

"\\\""
{ '\', '"', 0 }
//^^^

Note that I wrote '\' just for illustration! How would you define a single quote character? By escaping again: '\''. But now you need to escape the escape character, too, so a single backslash actually needs to be written as '\\' here (whereas, in comparison, you don't have to escape the single quote in a string literal: "i am 'singly quoted'", just as you didn't have to escape the double quote in the character literal).
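These escaping rules can be checked by the compiler itself. The following is a small sketch under the assumption of a C++11 (or later) compiler; the constant names and check_literals are chosen for this example only.

```cpp
#include <cassert>
#include <string>

// Character literals: the backslash and the single quote must be escaped,
// while the double quote may be written as-is.
constexpr char backslash    = '\\';
constexpr char single_quote = '\'';
constexpr char double_quote = '"';

// Escaping the double quote in a character literal is allowed but optional.
static_assert(double_quote == '\"', "both spellings name the same character");

// sizeof on a string literal includes the terminating '\0'.
static_assert(sizeof("\"") == 2, "quote + terminator");
static_assert(sizeof("\\\"") == 3, "backslash + quote + terminator");
static_assert(sizeof("'") == sizeof("\'"), "single quote needs no escape here");

// Runtime view of the literal used in the question's json_string.
void check_literals()
{
    const std::string s = "\\\"";
    assert(s.size() == 2);
    assert(s[0] == backslash);
    assert(s[1] == double_quote);
}
```

If any of the static_assert lines were wrong, the file would not compile, which makes this a cheap way to test your understanding of a literal.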

As JSON uses double quotes for strings, too, you'd most likely want to change your function:

return "\"" + incoming_str + "\"";

or even much simpler:

return '"' + incoming_str + '"';

Now

json_string("hello world") == "\"hello world\""

would yield true...

¹ Side note (borrowed from an answer that has since been deleted): Since C++11, there are raw string literals, too. Using these, you don't have to escape quotes either.
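As a rough illustration of the raw string literal syntax, here is a variant of the function written with R"(...)". The name json_string_raw is hypothetical and the sketch assumes a C++11 (or later) compiler.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of json_string using C++11 raw string literals.
std::string json_string_raw(const std::string& incoming_str)
{
    // Everything between R"( and )" is taken literally,
    // so R"(")" is a one-character string holding a double quote.
    return R"(")" + incoming_str + R"(")";
}

// Self-check: the raw and the escaped spellings denote the same strings.
void check_raw_literals()
{
    assert(json_string_raw("hello world") == "\"hello world\"");
    assert(std::string(R"("hello world")") == "\"hello world\"");
    // Inside a raw literal a backslash is just a backslash.
    assert(std::string(R"(\")").size() == 2);
}
```

For a single quote character the escaped form "\"" or the character literal '"' is arguably easier to read; raw literals pay off for longer text with many quotes or backslashes.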
