
Java: print a string as Unicode

I was processing some Twitter data using Java. I read it from a file, do some processing, and print it to stdout.
The text in the file looks like this:

"RT @Bollogosta319a: #BuyBookSilentSinners \u262fGain Followers\n\u262fRT This\n\u262fMUST FOLLOW ME I FOLLOW BACK\n\u262fFollow everyone who rts\n\u262fGain\n #ANDROID \u2026"

I read it in and print it to stdout. The output is supposed to be:

"RT @Bollogosta319a: #BuyBookSilentSinners ☯Gain Followers\n☯RT This\n☯MUST FOLLOW ME I FOLLOW BACK\n☯Follow everyone who rts\n☯Gain\n #ANDROID …"

But my output looks like this:

"RT @Bollogosta319a: #BuyBookSilentSinners ?Gain Followers ?RT This ?MUST FOLLOW ME I FOLLOW BACK ?Follow everyone who rts ?Gain #ANDROID ?"

So it seems that I have two problems to deal with:
1. Print the actual Unicode character instead of the Unicode escape string.
2. Keep "\n" as it is, instead of turning it into a newline in the output.

How can I do this? (Dealing with different encodings in Java is driving me crazy.)

I don't know how you are parsing the file, but the method you are using seems to be interpreting escape codes (like \n and \u262f). To keep instances of \n in the file literal, you could replace \n with \\n before applying whatever interprets the escape codes. The \\ will be converted to a single \, and the n will be left alone. Have you tried using a plain java.io.FileReader to read the file? That may be simpler.
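The idea can also be turned around: decode only the \uXXXX escapes yourself and never touch \n at all. The following is a minimal sketch, assuming the file holds literal backslash-u escapes with exactly four hex digits. The names UnicodeUnescape and unescapeUnicode are made up for illustration and do not come from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeUnescape {
    // A literal backslash, the letter u, then exactly four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Decodes only the four-digit Unicode escapes; every other backslash
    // sequence (such as backslash-n) is copied through unchanged.
    static String unescapeUnicode(String input) {
        Matcher m = UNICODE_ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' or '\' in the replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // In Java source the backslashes are doubled so that the string
        // holds the same raw characters as a line read from the file.
        String raw = "\\u262fGain Followers\\n\\u262fRT This\\n #ANDROID \\u2026";
        System.out.println(unescapeUnicode(raw));
    }
}
```

One limitation of this sketch: it does not distinguish an escaped backslash followed by u from a real escape, so input that contains such sequences would need a proper tokenizer instead of a single regular expression.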

The Unicode symbols may actually be read correctly; many terminals do not support the full range of Unicode characters and print a placeholder symbol in place of those they do not understand. Perhaps your program prints ☯ and the terminal simply doesn't know how to render it, so it shows a ? instead.
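A related thing worth ruling out (this is an assumption, not something the question confirms) is the encoding that System.out itself uses. If the platform default charset cannot represent ☯, Java substitutes ? before the terminal ever sees the character. A small sketch that wraps a stream in a UTF-8 PrintStream follows; the class name Utf8Stdout and the helper utf8 are hypothetical.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Stdout {
    // Wraps any OutputStream in an auto-flushing PrintStream that encodes as UTF-8.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        // The escapes below are resolved by the Java compiler to the real characters.
        out.println("\u262f Gain Followers \u2026");
    }
}
```

If the symbols still show up as ? or as garbage with this in place, the terminal's own encoding or font is the more likely culprit, as suggested above.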
