
How to ignore the apostrophe in String.ToUpper()?

In French, a lot of city names contain an apostrophe, like "rue de l'église".

We use a converter to write them in full upper case in almost every part of the UI.

But string.ToUpper seems to have a bug, because we get "RUE DE L'église" instead of the "RUE DE L'ÉGLISE" we are supposed to get.

Can you explain why? Is there any way to get the expected result?

My converter looks like this:

    public object Convert(object value, Type targetType, object parameter, CultureInfo culture)
    {
        if (value != null)
        {
            var res = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(value.ToString().ToUpper());
            return res;
        }

        return String.Empty;
    }

You've probably hit a corner case that wasn't considered, or the behavior is "correct".

The documentation of TextInfo.ToTitleCase states:

Converts the specified string to title case (except for words that are entirely in uppercase, which are considered to be acronyms).

(my emphasis)

The code probably doesn't consider non-letter characters, so the presence of the apostrophe makes this a word that isn't all uppercase, and thus the letters after the first are converted to lowercase.

The question is, isn't this behavior correct? The presence of the apostrophe means this is not an acronym, and thus it shouldn't follow the rule that all-uppercase words (acronyms) follow. The correct behavior for non-acronym words is that the first letter becomes uppercase and the rest lowercase (regardless of their current state).

Regardless of this, there is additional documentation further down on the same page:

As illustrated above, the ToTitleCase method provides an arbitrary casing behavior which is not necessarily linguistically correct. A linguistically correct solution would require additional rules, and the current algorithm is somewhat simpler and faster. We reserve the right to make this API slower in the future.

Which means they've actually documented that it doesn't necessarily do exactly what people want, and only provides a good-enough(tm) approach to the problem.

In light of all this, I'd say the method behaves exactly as documented.
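
The acronym rule quoted above can be observed directly. The following is a minimal sketch, not taken from the original answer: the class name TitleCaseDemo and its Title helper are hypothetical, and the invariant culture is used only to keep the example independent of the machine's regional settings.

```csharp
using System.Globalization;

public static class TitleCaseDemo
{
    // Hypothetical helper: calls TextInfo.ToTitleCase with an explicit culture
    // instead of relying on the current thread's culture.
    public static string Title(string text)
    {
        return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(text);
    }
}
```

Per the documented rule, TitleCaseDemo.Title("war and peace") returns "War And Peace", while TitleCaseDemo.Title("WAR AND PEACE") is returned unchanged because every word is entirely uppercase. A word such as "L'ÉGLISE" is where the two readings of "entirely in uppercase" diverge, so its result is best checked on the runtime you actually target.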

ToTitleCase() does not do what you want. It capitalizes the first character of each word. What you want is just plain string.ToUpper():

Console.WriteLine("rue de l'église".ToUpper());

Output:

RUE DE L'ÉGLISE

With ToTitleCase():

Console.WriteLine(CultureInfo.GetCultureInfo("fr-fr").TextInfo.ToTitleCase("rue de l'église"));

Output:

Rue De L'église

Combining ToTitleCase() and ToUpper() causes the weird behavior you describe, since ToTitleCase() tries to lowercase every character other than the first (except for words that are all uppercase and considered acronyms, according to the documentation).
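
Applied to the converter from the question, this amounts to dropping the ToTitleCase() call. The following is only a sketch under assumptions: UpperCaseHelper and ToDisplayUpper are hypothetical names, and the method is written as a plain static helper rather than a full IValueConverter so that it compiles without any UI framework reference.

```csharp
using System.Globalization;

public static class UpperCaseHelper
{
    // Hypothetical helper mirroring the body of the question's Convert method,
    // minus the ToTitleCase call that lowercased part of the text again.
    public static string ToDisplayUpper(object value, CultureInfo culture)
    {
        if (value == null)
        {
            return string.Empty;
        }

        // string.ToUpper(CultureInfo) applies the casing rules of the given culture;
        // fall back to the current culture when none is supplied.
        return value.ToString().ToUpper(culture ?? CultureInfo.CurrentCulture);
    }
}
```

Inside a real converter, the body of Convert could simply return the result of this helper, passing along the culture argument it receives.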

Maybe the issue is in your "CurrentCulture" or in "ToTitleCase"?

Tell me your "CultureInfo" identifier ( System.Globalization.CultureInfo.CurrentCulture.ToString() ), so I can investigate further.

I have tried in VB and I have the same "problem".

The problem is linked to the ToTitleCase() function, because the ToUpper() function works well.

I tried adding "chrétienne" just after "église":

Dim s = "Rue de l'église chrétienne".ToUpper()
Dim res = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(s)

The res variable contains "RUE DE L'église CHRÉTIENNE".

You can see that the 'é' of "église" has not been converted, but the 'é' of "chrétienne" has been converted to upper case!

The s variable contains "RUE DE L'ÉGLISE CHRÉTIENNE".

If I replace "église" with "eglise" (without the accented character), the res variable contains "RUE DE L'eglise CHRÉTIENNE".

So the 'é' character itself has no impact on the conversion.

My regional setting is FR-FR.

I think it is a bug, because Microsoft does not correctly handle French, in which the apostrophe is part of the language.

While waiting for a fix from Microsoft, you can use the following workaround:

Dim res = CultureInfo.CurrentCulture.TextInfo.
    ToTitleCase(s.Replace("'", "--")).
    Replace("--", "'")

In fact, if what you want is a conversion to title case, you must remove the conversion to upper case.

The correct code would be:

Dim s = "Rue de l--église chrétienne de l--hiver"
Dim res = CultureInfo.CurrentCulture.TextInfo.
    ToTitleCase(s.Replace("'", "--")).Replace("--", "'")

and the res variable contains "Rue De L'Église Chrétienne De L'Hiver"!
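
If the placeholder trick feels fragile (for example when the text may already contain "--"), one alternative is to skip ToTitleCase() altogether and case each letter by hand. The following C# sketch is an assumption-laden illustration rather than part of the original answer: ManualTitleCase and Apply are hypothetical names, and it treats whitespace and apostrophes as the only word boundaries.

```csharp
using System.Globalization;
using System.Text;

public static class ManualTitleCase
{
    // Hypothetical helper: uppercases the first letter of every word and
    // lowercases the remaining letters, where a new word starts after
    // whitespace or after an apostrophe (straight or typographic).
    public static string Apply(string text, CultureInfo culture)
    {
        var result = new StringBuilder(text.Length);
        bool startOfWord = true;

        foreach (char c in text)
        {
            if (char.IsLetter(c))
            {
                result.Append(startOfWord ? char.ToUpper(c, culture) : char.ToLower(c, culture));
                startOfWord = false;
            }
            else
            {
                result.Append(c);
                startOfWord = char.IsWhiteSpace(c) || c == '\'' || c == '\u2019';
            }
        }

        return result.ToString();
    }
}
```

Unlike ToTitleCase(), this sketch has no acronym rule, so all-uppercase input is lowercased after the first letter, and a word such as "aujourd'hui" becomes "Aujourd'Hui". Whether that is acceptable depends on the data being displayed.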
