I have an app written in C# that does a lot of string comparison. The strings are pulled in from a variety of sources (including user input) and are then compared. However I'm running into problems when comparing space '32' to non-breaking space '160'. To the user they look the same and so they expect a match. But when the app does the compare, there is no match.
What is the best way to go about this? Am I going to have to go to all parts of the code that do a string compare and manually normalize non-breaking spaces to spaces? Does .NET offer anything to help with that? (I've tried all the compare options but none seem to help.)
It has been suggested that I normalize the strings upon receipt and then let the string compare method simply compare the normalized strings. I'm not sure that would be straightforward, because what is a normalized string in the first place? What do I normalize it to? Sure, for now I can convert non-breaking spaces to breaking spaces. But what else can show up? Can there potentially be very many of these rules? Might they even conflict? (In one case I want to use a rule and in another I don't.)
I went through lots of pain to find this simple answer. The code below uses a regular expression to replace non breaking spaces with normal spaces.
string cellText = "String with non breaking spaces.";
cellText = Regex.Replace(cellText, @"\u00A0", " ");
Hope this helps, Dan
If it were me, I would 'normalize' the strings as I 'pulled them in'; probably with a string.Replace(). Then you won't need to change your comparisons anywhere else.
Edit: Mark, that's a tough one. It's really up to you, or your clients, as to what is a 'normalized' string. I've been in a similar situation where the customer demanded that strings like:
"I have 4 apples." and "I have four apples."
were actually equal. You may need separate normalizers for different situations. Either way, I would still do the normalization upon retrieval of the original strings.
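As a rough sketch of what normalizing on retrieval could look like, assuming the only rule so far is the non-breaking space: `InputNormalizer` and `Clean` are hypothetical names made up for this illustration, not part of .NET.

```csharp
using System;

// Hypothetical helper: call it once, at the point where a string enters the app.
public static class InputNormalizer
{
    public static string Clean(string input)
    {
        if (input == null)
        {
            return null;
        }

        // Map the non-breaking space (U+00A0, decimal 160) to an ordinary space (U+0020, decimal 32).
        return input.Replace('\u00A0', ' ');
    }
}
```

Every string would go through `InputNormalizer.Clean` before being stored, so the existing comparisons stay untouched. If different situations need different rules, one such method per situation is probably the simplest starting point.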
It needs to be
text.Replace('\u00A0', ' ')
where \u00A0 is the non-breaking space. This replaces each non-breaking space with a normal space.
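I'd suggest creating your own string comparer that derives from StringComparer and does the "normalization" there (replacing non-breaking spaces with regular spaces). Note that the static String.Equals overloads take a StringComparison value rather than a comparer object, so a custom comparer has to be called directly or passed to an API that accepts an IEqualityComparer<string> or IComparer<string>.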
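A minimal sketch of such a comparer, assuming non-breaking spaces are the only thing that needs folding. `NbspInsensitiveComparer` and `Fold` are hypothetical names chosen for this example; the class wraps an existing `StringComparer` and delegates to it after the replacement.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer that treats U+00A0 as an ordinary space.
public sealed class NbspInsensitiveComparer : StringComparer
{
    private readonly StringComparer _inner;

    public NbspInsensitiveComparer(StringComparer inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    // Replace non-breaking spaces; null passes through unchanged.
    private static string Fold(string s) => s?.Replace('\u00A0', ' ');

    public override int Compare(string x, string y) => _inner.Compare(Fold(x), Fold(y));

    public override bool Equals(string x, string y) => _inner.Equals(Fold(x), Fold(y));

    // Hash the folded string so that equal strings get equal hash codes.
    public override int GetHashCode(string obj) => _inner.GetHashCode(Fold(obj));
}
```

It can then be handed to anything that accepts a comparer, for example `new HashSet<string>(new NbspInsensitiveComparer(StringComparer.Ordinal))`, or called directly via its `Equals(x, y)` method. The trade-off is that the replacement runs on every comparison rather than once on intake.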
The same thing without a regular expression, mainly for my own future reference:
text.Replace('\u00A0', ' ')