Implicit type cast of char to int in C#

I have a question about implicit type conversion.

Why does this implicit type conversion work in C#? I've learned that implicit conversions usually don't work.

Here is a code sample that uses implicit type conversion:

 char c = 'a';
 int x = c;                   // implicit char-to-int conversion: x is 97
 int n = 5;
 int answer = n * c;          // c is promoted to int: 5 * 97
 Console.WriteLine(answer);   // prints 485

UPDATE: I am using this question as the subject of my blog today. Thanks for the great question. Please see the blog for future additions, updates, comments, and so on.

http://blogs.msdn.com/ericlippert/archive/2009/10/01/why-does-char-convert-implicitly-to-ushort-but-not-vice-versa.aspx


It is not entirely clear to me what exactly you are asking. "Why" questions are difficult to answer. But I'll take a shot at it.

First, code which has an implicit conversion from char to int (note: this is not an "implicit cast", this is an "implicit conversion") is legal because the C# specification clearly states that there is an implicit conversion from char to int, and the compiler is, in this respect, a correct implementation of the specification.

Now, you might sensibly point out that the question has been thoroughly begged. Why is there an implicit conversion from char to int? Why did the designers of the language believe that this was a sensible rule to add to the language?

Well, first off, the obvious things which would prevent this from being a rule of the language do not apply. A char is implemented as an unsigned 16 bit integer that represents a character in a UTF-16 encoding, so it can be converted to a ushort without loss of precision, or, for that matter, without change of representation. The runtime simply goes from treating this bit pattern as a char to treating the same bit pattern as a ushort.
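A minimal sketch of that point, assuming a console program written with top-level statements (the variable names are only illustrative). It shows that a char flows into ushort and int without a cast, and that the value survives a round trip:

```csharp
using System;

char c = 'a';

// Implicit conversions: no cast needed, and no information is lost.
ushort asUShort = c;   // same 16-bit pattern, now treated as a number
int asInt = c;         // widened to 32 bits

Console.WriteLine(asUShort);      // 97
Console.WriteLine(asInt);         // 97
Console.WriteLine(sizeof(char));  // 2 (bytes), the same as sizeof(ushort)

// Going back requires an explicit cast, and round-trips exactly.
char back = (char)asUShort;
Console.WriteLine(back == c);     // True
```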

It is therefore possible to allow a conversion from char to ushort. Now, just because something is possible does not mean it is a good idea. Clearly the designers of the language thought that implicitly converting char to ushort was a good idea, but implicitly converting ushort to char is not. (And since char to ushort is a good idea, it seems reasonable that char-to-anything-that-ushort-goes-to is also reasonable, hence, char to int. Also, I hope that it is clear why allowing explicit casting of ushort to char is sensible; your question is about implicit conversions.)

So we actually have two related questions here. First, why is it a bad idea to allow implicit conversions from ushort/short/byte/sbyte to char? And second, why is it a good idea to allow implicit conversions from char to ushort?

Unlike you, I have the original notes from the language design team at my disposal. Digging through those, we discover some interesting facts.

The first question is covered in the notes from April 14th, 1999, where the question of whether it should be legal to convert from byte to char arises. In the original pre-release version of C#, this was legal for a brief time. I've lightly edited the notes to make them clear without an understanding of 1999-era pre-release Microsoft code names. I've also added emphasis on important points:

[The language design committee] has chosen to provide an implicit conversion from bytes to chars, since the domain of one is completely contained by the other. Right now, however, [the runtime library] only provide Write methods which take chars and ints, which means that bytes print out as characters since that ends up being the best method. We can solve this either by providing more methods on the Writer class or by removing the implicit conversion.

There is an argument for why the latter is the correct thing to do. After all, bytes really aren't characters. True, there may be a useful mapping from bytes to chars, but ultimately, 23 does not denote the same thing as the character with ascii value 23, in the same way that 23B denotes the same thing as 23L. Asking [the library authors] to provide this additional method simply because of how a quirk in our type system works out seems rather weak. So I would suggest that we make the conversion from byte to char explicit.

The notes then conclude with the decision that byte-to-char should be an explicit conversion, and integer-literal-in-range-of-char should also be an explicit conversion.

Note that the language design notes do not call out why ushort-to-char was also made illegal at the same time, but you can see that the same logic applies. When calling a method overloaded as M(int) and M(char), when you pass it a ushort, odds are good that you want to treat the ushort as a number, not as a character. And a ushort is NOT a character representation in the same way that a ushort is a numeric representation, so it seems reasonable to make that conversion illegal as well.
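The overload situation described in the notes can still be observed today. The sketch below is an illustration rather than anything taken from the design notes: it assumes that `System.IO.StringWriter` (through `TextWriter`) offers `Write(char)` and `Write(int)` but no `Write(byte)` or `Write(ushort)`, which is my understanding of the current library. Because byte and ushort convert implicitly to int but not to char, the numeric overload is chosen:

```csharp
using System;
using System.IO;

var writer = new StringWriter();

byte b = 65;
ushort u = 66;
char c = 'C';

writer.Write(b);     // binds to Write(int): appends "65"
writer.Write(' ');
writer.Write(u);     // binds to Write(int): appends "66"
writer.Write(' ');
writer.Write(c);     // binds to Write(char): appends "C"

Console.WriteLine(writer.ToString());   // 65 66 C
```

Had byte-to-char stayed implicit, `Write(char)` would have been the better match for a byte argument, which is exactly the "bytes print out as characters" problem the notes describe.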

The decision to make char go to ushort was made on the 17th of September, 1999; the design notes from that day on this topic simply state "char to ushort is also a legal implicit conversion", and that's it. No further exposition of what was going on in the language designers' heads that day is evident in the notes.

However, we can make educated guesses as to why implicit char-to-ushort was considered a good idea. The key idea here is that the conversion from number to character is a "possibly dodgy" conversion. It's taking something that you do not KNOW is intended to be a character, and choosing to treat it as one. That seems like the sort of thing you want to call out that you are doing explicitly, rather than accidentally allowing it. But the reverse is much less dodgy. There is a long tradition in C programming of treating characters as integers -- to obtain their underlying values, or to do mathematics on them.

In short: it seems reasonable that using a number as a character could be an accident and a bug, but it also seems reasonable that using a character as a number is deliberate and desirable. This asymmetry is therefore reflected in the rules of the language.
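The asymmetry is easy to see in a few lines. This is only a sketch with made-up variable names, again assuming top-level statements:

```csharp
using System;

// Character to number: implicit, in the C tradition.
int digit = '7' - '0';          // numeric value of a digit character
int offset = 'e' - 'a';         // zero-based position in the alphabet

// Number to character: must be spelled out with a cast.
char next = (char)('a' + 1);
// char bad = 'a' + 1;          // would not compile: no implicit int-to-char conversion

Console.WriteLine(digit);   // 7
Console.WriteLine(offset);  // 4
Console.WriteLine(next);    // b
```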

Does that answer your question?

The basic idea is that conversions that cannot lead to data loss can be implicit, whereas conversions that may lead to data loss have to be explicit (using, for instance, a cast operator).

So implicitly converting from char to int will work in C#.

[edit]As others pointed out, a char is a 16-bit number in C#, so this conversion is just from a 16-bit integer to a 32-bit integer, which is possible without data-loss.[/edit]

C# supports implicit conversions. The idea that they "usually don't work" probably comes from some other language, probably C++, where some glorious string implementations provided implicit conversions to diverse pointer types, creating some gigantic bugs in applications.

When you provide type conversions yourself, in whatever language, you should make them explicit by default and only provide implicit conversions for special cases.

From the C# Specification:

6.1.2 Implicit numeric conversions. The implicit numeric conversions are:

• From sbyte to short, int, long, float, double, or decimal.

• From byte to short, ushort, int, uint, long, ulong, float, double, or decimal.

• From short to int, long, float, double, or decimal.

• From ushort to int, uint, long, ulong, float, double, or decimal.

• From int to long, float, double, or decimal.

• From uint to long, ulong, float, double, or decimal.

• From long to float, double, or decimal.

• From ulong to float, double, or decimal.

• From char to ushort, int, uint, long, ulong, float, double, or decimal.

• From float to double.

Conversions from int, uint, long, or ulong to float and from long or ulong to double may cause a loss of precision, but will never cause a loss of magnitude. The other implicit numeric conversions never lose any information. There are no implicit conversions to the char type, so values of the other integral types do not automatically convert to the char type.
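The "loss of precision, but not of magnitude" remark can be checked with values just past the significand width of float and double. The following is a small sketch, assuming a current .NET runtime with IEEE 754 single and double precision arithmetic; the specific test values are my own choice:

```csharp
using System;

// int -> float is implicit even though float has only a 24-bit significand.
int big = 16_777_217;                 // 2^24 + 1
float f = big;                        // rounds to the nearest float
Console.WriteLine((int)f);            // 16777216: the low bit was lost

// long -> double: double has a 53-bit significand.
long huge = 9_007_199_254_740_993L;   // 2^53 + 1
double d = huge;
Console.WriteLine((long)d);           // 9007199254740992

// char -> int never loses anything: every char value fits in an int.
int max = char.MaxValue;
Console.WriteLine(max);               // 65535
```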

From the MSDN page about char (char (C# Reference)):

A char can be implicitly converted to ushort, int, uint, long, ulong, float, double, or decimal. However, there are no implicit conversions from other types to the char type.

It's because they have implemented an implicit conversion from char to all those types. Now if you ask why they implemented them, I'm really not sure; presumably to help when working with the ASCII representation of chars or something like that.

A cast can cause data loss when the target type is narrower than the source. Here char is 16 bits and int is 32 bits, so the conversion happens without loss of data.

Real-life example: we can put a small vessel into a big vessel, but not the other way round without external help.
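In code, the "external help" is an explicit cast. A rough sketch, with illustrative names and an arbitrary out-of-range value; `unchecked` is written out so the result does not depend on project-level overflow settings:

```csharp
using System;

char small = 'a';
int widened = small;                       // implicit: 16 bits always fit in 32
Console.WriteLine(widened);                // 97

int tooBig = 70_000;                       // does not fit in 16 bits
// char broken = tooBig;                   // would not compile: needs a cast
char narrowed = unchecked((char)tooBig);   // explicit: high bits are discarded
Console.WriteLine((int)narrowed);          // 4464, i.e. 70000 - 65536
```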

The implicit conversion from char to number types makes no sense, in my opinion, because a loss of information happens. You can see it from this example:

string ab = "ab";
char a = ab[0];
char b = ab[1];
var d = a + b;   //195

We have put all pieces of information from the string into chars. If by any chance only the information from d is kept, all that is left to us is a number which makes no sense in this context and cannot be used to recover the previously provided information. Thus, the most useful way to go would be to implicitly convert the "sum" of chars to a string.
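For reference, here is what the language actually does with that example, and one way to get the string the answer is asking for. This is only a sketch; `joined` is a name I made up:

```csharp
using System;

string ab = "ab";
char a = ab[0];
char b = ab[1];

var d = a + b;                       // both operands are promoted to int
Console.WriteLine(d.GetType());      // System.Int32
Console.WriteLine(d);                // 195 (97 + 98)

// To concatenate instead, make one operand a string.
string joined = a.ToString() + b;
Console.WriteLine(joined);           // ab
```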

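It works because every char is handled internally as a number, so the conversion is implicit.

The char is implicitly converted to its Unicode numeric value, which is an integer.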

The core of @Eric Lippert's blog entry is his educated guess for the reasoning behind this decision of the C# language designers:

"There is a long tradition in C programming of treating characters as integers 
-- to obtain their underlying values, or to do mathematics on them."

It can cause errors though, such as:

var s = new StringBuilder('a');

Which you might think initialises the StringBuilder with an 'a' character to start with, but actually sets the capacity of the StringBuilder to 97.
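A quick way to confirm this, sketched with top-level statements (the second builder `t` is added only for contrast):

```csharp
using System;
using System.Text;

// 'a' converts implicitly to int, so this calls StringBuilder(int capacity).
var s = new StringBuilder('a');
Console.WriteLine(s.Length);       // 0: the builder is empty
Console.WriteLine(s.Capacity);     // 97

// Passing a string selects the StringBuilder(string) constructor instead.
var t = new StringBuilder("a");
Console.WriteLine(t.ToString());   // a
```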
