Java print string as unicode
I am processing some Twitter data using Java. I read the tweets from a file, do some processing, and print them to stdout.
The text in the file looks like this:
"RT @Bollogosta319a: #BuyBookSilentSinners \u262fGain Followers\n\u262fRT This\n\u262fMUST FOLLOW ME I FOLLOW BACK\n\u262fFollow everyone who rts\n\u262fGain\n #ANDROID \u2026"
I read it in and print it out to stdout. The output is supposed to be:
"RT @Bollogosta319a: #BuyBookSilentSinners ☯Gain Followers\n☯RT This\n☯MUST FOLLOW ME I FOLLOW BACK\n☯Follow everyone who rts\n☯Gain\n #ANDROID …"
But my output looks like this:
"RT @Bollogosta319a: #BuyBookSilentSinners ?Gain Followers ?RT This ?MUST FOLLOW ME I FOLLOW BACK ?Follow everyone who rts ?Gain #ANDROID ?"
So, it seems that I have two problems to deal with:
1. Print the actual Unicode character instead of its escape sequence.
2. Keep "\n" as it is, instead of turning it into a newline in the output.
How can I do this? (Dealing with the different encodings in Java is driving me crazy.)
I don't know how you are parsing the file, but the method you are using seems to be interpreting escape codes (like \n and \u262f). To keep instances of \n in the file literal, you could replace \n with \\n before running whatever interprets the escape codes. The \\ will be converted to a single \, and the n will be left alone. Have you tried using a plain java.io.FileReader to read the file? That may be simpler, since it reads the characters as they are and performs no escape processing.
The Unicode symbols may actually be read correctly; many terminals do not support the full range of Unicode characters and print some substitute symbol in place of those they do not understand. Perhaps your program prints ☯ and the terminal simply doesn't know how to render it, so it prints a ? instead.
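One way to tell these two cases apart is to print the code points of the string in plain ASCII, and to write to stdout through a stream that is explicitly UTF-8. The sketch below is only an illustration under assumptions: the class `TerminalEncodingCheck` and its methods are hypothetical names, and it assumes the terminal itself is configured for UTF-8, which the program cannot guarantee.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical diagnostic helper, not part of the original question.
class TerminalEncodingCheck {
    // Lists the code points of s in ASCII-only form, e.g. "U+262F U+0047",
    // so the result is readable even on a terminal that cannot render them.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Wraps a stream in an auto-flushing PrintStream that encodes text as UTF-8
    // instead of the platform default encoding.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u262fGain Followers \u2026";
        PrintStream out = utf8(System.out);
        out.println(codePoints(text)); // ASCII only, safe on any terminal
        out.println(text);             // rendering depends on the terminal
    }
}
```

If the first line shows U+262F but the second line still shows a ?, the string in memory is fine and the problem lies in the terminal or its font rather than in the Java code.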