
Converting the Unicode string "\u0063" into "c"

I'm doing some cryptanalysis homework and was trying to write code that does a + b = c. My idea was to use Unicode: b + (b - a) = c. The problem is that my code returns the Unicode escape for c, not the String "c", and I can't convert it.

Can someone please explain the difference between the string below called unicodeOfC and those called test and test2? Also, is there any way I could get the string unicodeOfC to print "c"?

//this calculates the unicode value for c
String unicodeOfC = ("\\u" + Integer.toHexString('b'+('b'-'a') | 0x10000).substring(1));

//this prints \u0063
System.out.println(unicodeOfC);

String test = "\u0063";

//this prints c
System.out.println(test);

//this is false
System.out.println(test.equals(unicodeOfC));

String test2 = "\u0063";
//this is true
System.out.println(test.equals(test2));

There is no difference between test and test2. They are both String literals referring to the same String. This String literal is made up of a Unicode escape, which the Java Language Specification describes as follows:

A compiler for the Java programming language ("Java compiler") first recognizes Unicode escapes in its input, translating the ASCII characters \u followed by four hexadecimal digits to the UTF-16 code unit (§3.1) for the indicated hexadecimal value, and passing all other characters unchanged.

So the compiler translates this Unicode escape into the corresponding UTF-16 code unit. That is, the Unicode escape \u0063 translates to the character c. This happens at compile time, before the program runs.
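As a small illustration of the compile-time translation described above, the sketch below checks that a literal written with the escape is an ordinary one-character String. The class and method names are made up for this example.

```java
final class CompileTimeEscapeDemo {
    // The compiler has already replaced the escape in this literal
    // with the single letter c before the program runs.
    static String escapedLiteral() {
        return "\u0063";
    }

    public static void main(String[] args) {
        String test = escapedLiteral();
        System.out.println(test.length());     // prints 1
        System.out.println(test.equals("c"));  // prints true
    }
}
```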

In this

String unicodeOfC = ("\\u" + Integer.toHexString('b'+('b'-'a') | 0x10000).substring(1));

the String literal "\\u" (which uses a \ character to escape a \ character) has a runtime value of \u, i.e. the two characters \ and u. The call to toHexString(..) returns 10063, and substring(1) drops the leading 1 (which | 0x10000 added only to pad the value to four hex digits), leaving 0063. That result is concatenated with \u and assigned to unicodeOfC. So the String value is \u0063, i.e. the 6 characters \, u, 0, 0, 6, and 3. Because this String is built at runtime, the compiler never sees it as a Unicode escape and never translates it.
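To make the six-character claim concrete, here is a minimal sketch that rebuilds the question's expression and inspects the result. The class and method names are hypothetical; the expression itself is the one from the question.

```java
final class RuntimeEscapeDemo {
    // Builds the same value as the question's unicodeOfC.
    static String buildUnicodeOfC() {
        // 'b' + ('b' - 'a') is 99 (0x63); OR-ing with 0x10000 gives 0x10063,
        // so toHexString yields "10063" and substring(1) leaves "0063".
        return "\\u" + Integer.toHexString('b' + ('b' - 'a') | 0x10000).substring(1);
    }

    public static void main(String[] args) {
        String unicodeOfC = buildUnicodeOfC();
        System.out.println(unicodeOfC.length());          // prints 6
        System.out.println(unicodeOfC.charAt(0) == '\\'); // prints true
        System.out.println(unicodeOfC.equals("c"));       // prints false
    }
}
```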

Also is there any way I could get the string unicodeOfC to print "c"?

Similarly to how you created it, you need to get the numerical part of the Unicode escape and parse it as a hexadecimal number:

String numerical = unicodeOfC.replace("\\u", "");
int val = Integer.parseInt(numerical, 16);
System.out.println((char) val);

Casting the parsed value to char gives the character c, which you can then print.
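The same idea can be generalized to a String that contains several such escapes mixed with ordinary text. The following is only a sketch under that assumption: the class UnicodeEscapeDecoder and its decode method are invented for this example, and it handles only the four-hex-digit form.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnicodeEscapeDecoder {
    // Matches a backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces every textual escape in the input with the char it names.
    static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the ordinary text that precedes this escape.
            out.append(input, last, m.start());
            // Parse the four hex digits and append the resulting char.
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        // Copy whatever follows the final escape.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("\\u0063"));   // prints c
        System.out.println(decode("a\\u0062c")); // prints abc
    }
}
```

If the goal is only the shifted letter itself, casting the computed value directly, as in (char) ('b' + ('b' - 'a')), yields 'c' without building an escape string at all.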

I think you're not understanding how string escaping works.

In Java, backslash is an escape character that lets you put characters in string literals such as newlines (\n), tabs (\t), or Unicode escapes (\u0063).

Suppose I am writing code and I need to print a newline. I would write System.out.println("\n");

Now let's say I want to show a backslash. System.out.println("\"); is a compile error, but System.out.println("\\"); will print \.

So your first string prints the literal backslash character, then the letter u, then the hexadecimal digits.
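A quick way to see the difference between these literals is to compare their lengths at runtime. This is a minimal sketch; the class and method names are made up for the example.

```java
final class BackslashEscapeDemo {
    // Lengths of three literals: a newline escape, an escaped backslash,
    // and an escaped backslash followed by the letter n.
    static int[] lengths() {
        return new int[] { "\n".length(), "\\".length(), "\\n".length() };
    }

    public static void main(String[] args) {
        int[] l = lengths();
        System.out.println(l[0]); // prints 1: a single newline character
        System.out.println(l[1]); // prints 1: a single backslash
        System.out.println(l[2]); // prints 2: a backslash and the letter n
    }
}
```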


 