
string.ToLowerInvariant() in C# vs String.toLowerCase(Locale.ROOT) in Java for Turkish İ

I see a difference in behavior between C# (.NET v4.0) and Java for converting 'İ' to lowercase with "invariant" culture.

In Java, "İ".toLowerCase(Locale.ROOT) returns 'i'.

In C#, "İ".ToLowerInvariant() and "İ".ToLower(CultureInfo.InvariantCulture) both return "İ", but "İ".ToLower(new CultureInfo("en-EN")) returns 'i'.

Looks like Java is doing the conversion correctly but C# is not. Is this a bug in C#?

Let's have a look. The letter in question

İ

is in fact

U+0130: Latin Capital Letter I With Dot Above

(as reported by Character Map). With the invariant culture we may not apply the rules of any specific culture, whether English or Turkish. So it seems reasonable, IMHO, that ToUpperInvariant() should return the letter itself (it is already a capital) and that ToLowerInvariant() should return something like

U+xxxx: Latin Small Letter I With Dot Above

However, Unicode has no such single letter:

https://en.wikipedia.org/wiki/Dotted_and_dotless_I

And since the required letter doesn't exist, all we can do is leave the original one intact.
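The missing lowercase letter can be checked from the Java side with Unicode normalization. This is only a sketch: the class and method names are made up for illustration. It uses the standard java.text.Normalizer to ask whether "I" or "i" followed by U+0307 (Combining Dot Above) composes into a single code point under NFC.

```java
import java.text.Normalizer;

final class PrecomposedDotCheck {
    // Number of code points left after NFC composition (hypothetical helper).
    static int nfcCodePointCount(String s) {
        String nfc = Normalizer.normalize(s, Normalizer.Form.NFC);
        return nfc.codePointCount(0, nfc.length());
    }

    public static void main(String[] args) {
        // Capital I + combining dot above composes to the single letter U+0130.
        System.out.println("I + U+0307 -> " + nfcCodePointCount("I\u0307") + " code point(s)");
        // Small i + combining dot above has no precomposed form, so it stays as two.
        System.out.println("i + U+0307 -> " + nfcCodePointCount("i\u0307") + " code point(s)");
    }
}
```

The capital form should collapse to one code point and the small form should stay at two, which matches the observation that no single "small i with dot above" letter exists.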

When we use a specific culture, say "en-EN" (English), we have the right to map Letter I With Dot Above to the good old English I, and thus ToLower() returns i.
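Because a dotted glyph on screen can hide what was actually returned, it may help to print the code points instead of the characters. The following is a minimal Java sketch. The class name and the codePoints helper are illustrative and do not come from either post. It lowercases U+0130 with Locale.ROOT and with a Turkish locale.

```java
import java.util.Locale;
import java.util.stream.Collectors;

final class DottedILowercase {
    // Format a string as space-separated code points, e.g. "U+0069 U+0307" (hypothetical helper).
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format(Locale.ROOT, "U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String capitalDottedI = "\u0130"; // Latin Capital Letter I With Dot Above

        // Locale-neutral lowercasing: expected to be 'i' plus U+0307 (Combining Dot Above).
        System.out.println("ROOT: " + codePoints(capitalDottedI.toLowerCase(Locale.ROOT)));

        // Turkish lowercasing: expected to be the plain 'i' (U+0069).
        System.out.println("tr:   " + codePoints(capitalDottedI.toLowerCase(Locale.forLanguageTag("tr"))));
    }
}
```

On a standard JDK the Locale.ROOT result should be two code points, U+0069 followed by U+0307, rather than a bare 'i'. That would mean Java also avoids mapping the letter to a plain English i outside a Turkish context. It just expresses the dot with a combining mark where .NET's invariant culture leaves the original letter as it is.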
