
string.IsNullOrEmpty & string.IsNullOrWhiteSpace return false for empty string

I have run into a curious case with a block of code designed to weed out blank strings, periods, dashes and the like after a paragraph of text has been processed by the MSFT Azure Phrase Breaker. I could use help figuring out how to debug the issue.

The following block of code returns true when given a value that appears to be "". The expectation is that the method returns false at the first if statement. Out of 899 phrases examined, only two have this problem, and it happens on another machine as well.

public static bool IsSentenceTranslatable(string sentence)
{
    string trimmedSentence = sentence.Trim();

    if (string.IsNullOrEmpty(trimmedSentence) || string.IsNullOrWhiteSpace(trimmedSentence))
    {
        return false;
    }

    switch (trimmedSentence)
    {
        case " ":
        case ".":
        case "-":
        case "·":
            return false;
    }

    return true;
}

Here is a snapshot of the debugger.

[Image: debugger snapshot]

Could it be a bug in Visual Studio or .NET? I tried the various visualizers, but still couldn't see anything in the box. I am using .NET 4.5 and C# 7.3.

Try to get the string's byte representation. I suspect that it actually contains some character which is invisible in VS debugger but doesn't count as a whitespace.

See these questions for hints:

UPD: since your Watch window shows that after the call string trimmedSentence = sentence.Trim() you have trimmedSentence.Length == 1, I'd upgrade my suspicion to certainty.
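One way to get at the byte representation is sketched below. The class and method names (StringInspector, DescribeChars, Utf8Hex) are made up for illustration; only the framework calls are real. It lists each UTF-16 code unit with its Unicode category and dumps the UTF-8 bytes as hex, so an invisible character shows up by number even when the debugger draws nothing.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of the original code.
public static class StringInspector
{
    // One entry per UTF-16 code unit: code point in hex plus its Unicode category.
    public static string DescribeChars(string s)
    {
        return string.Join(", ", s.Select(c =>
            string.Format("U+{0:X4} ({1})", (int)c, CharUnicodeInfo.GetUnicodeCategory(c))));
    }

    // The UTF-8 encoding of the string as dash-separated hex bytes.
    public static string Utf8Hex(string s)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(s));
    }
}
```

Calling StringInspector.DescribeChars(trimmedSentence) in the Immediate window or in a log statement should reveal which character is hiding in the two problematic phrases.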

As stated in my comment, the screenshot shows that trimmedSentence.Length is 1, so the string is not empty, and its content is definitely not a standard space, since Trim() would have removed that. If the string appears empty, it is because it holds one of those so-called invisible characters. To check what it is, you can access that character directly with trimmedSentence[0].
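A small reproduction may make this concrete. It assumes the culprit is U+200B (ZERO WIDTH SPACE), which is only a guess for illustration; the actual character in your data may differ. On .NET Framework 4 and later, that character is not classified as white space, so it survives Trim() and slips past IsNullOrWhiteSpace. The class name InvisibleCharDemo is hypothetical.

```csharp
using System;

// Hypothetical demo class; U+200B is an assumed example, not the confirmed culprit.
public static class InvisibleCharDemo
{
    public const string Suspect = "\u200B";

    // True when Trim() leaves the character in place (Length stays 1).
    public static bool SurvivesTrim()
    {
        return Suspect.Trim().Length == 1;
    }

    // True only if the framework considers the trimmed string blank.
    public static bool LooksBlankToFramework()
    {
        return string.IsNullOrWhiteSpace(Suspect.Trim());
    }

    // The numeric value of the first UTF-16 code unit after trimming.
    public static int FirstCharCode()
    {
        return (int)Suspect.Trim()[0];
    }
}
```

If the character in your string behaves the same way, the method reaches the final return true exactly as observed.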

If that character will appear often, you might want to consider doing this:

string trimmedSentence = sentence.Trim().Replace("<this special character>", "");

Alternatively, you can create the string to replace from its Unicode code point with Char.ConvertFromUtf32(yourCharCode), which already returns a string. You cannot use the Replace overload that takes char parameters, as there is no "empty" character, only an empty string. You should be able to read the code point while debugging or, if necessary, get it with (int)trimmedSentence[0].
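As a rough sketch of both ideas, the code below removes one known code point via Char.ConvertFromUtf32, and also shows a broader variant that drops every character in the Unicode Format category (plus non-whitespace control characters) before the blank check. The class SentenceFilter and the helpers RemoveCodePoint and StripInvisible are hypothetical names, and the category-based filter is an assumption about what counts as "invisible" in your data, not something confirmed by the screenshot.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical sketch; helper names and the category filter are assumptions.
public static class SentenceFilter
{
    // Removes every occurrence of a single code point, e.g. 0x200B.
    public static string RemoveCodePoint(string s, int codePoint)
    {
        return s.Replace(char.ConvertFromUtf32(codePoint), "");
    }

    // Drops Format characters and non-whitespace Control characters, then trims.
    public static string StripInvisible(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            UnicodeCategory cat = CharUnicodeInfo.GetUnicodeCategory(c);
            bool invisible = cat == UnicodeCategory.Format
                || (cat == UnicodeCategory.Control && !char.IsWhiteSpace(c));
            if (!invisible)
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Trim();
    }

    // Variant of the original method that ignores invisible characters.
    public static bool IsSentenceTranslatable(string sentence)
    {
        if (sentence == null)
        {
            return false;
        }

        string cleaned = StripInvisible(sentence);
        if (cleaned.Length == 0)
        {
            return false;
        }

        switch (cleaned)
        {
            case ".":
            case "-":
            case "·":
                return false;
        }

        return true;
    }
}
```

Note that the cleaned string is used only for the decision here. Format characters such as the zero-width joiner carry meaning in some scripts, so it is probably safer to pass the original sentence on to translation rather than the stripped one.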
