
Java UTF8 encoding issues with Storm

I'm pretty desperate for help after 2 days of trying to debug this issue.

I have some text that contains Unicode characters, for example the word:

korte støvler

If I run standalone code that writes this word to a file on one of the problem machines, it works correctly. However, when I write the file in exactly the same way from inside a Storm bolt, it is not encoded correctly, and the ø character is replaced with question marks.

In the storm_env.ini file I have set:

STORM_JAR_JVM_OPTS:-Dfile.encoding=UTF-8

I also set the encoding to UTF-8 explicitly in the code, and in the Maven (mvn) build when the jar is packaged.
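For reference, here is a minimal sketch of what "setting the encoding in the code" can look like, so the write does not depend on the JVM default charset. The class name `Utf8WriteCheck`, the helper `writeAndReadBack`, and the temp-file name are hypothetical and not taken from the original topology; the word is spelled with a `\u00f8` escape so the result does not depend on the compiler's source encoding either.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the original code.
class Utf8WriteCheck {
    // "korte støvler", with ø written as a Unicode escape.
    static final String SAMPLE = "korte st\u00f8vler";

    // Writes the text to a temp file with an explicit UTF-8 charset,
    // then returns the raw bytes that ended up on disk.
    static byte[] writeAndReadBack(String text) {
        try {
            Path path = Files.createTempFile("utf8-check", ".txt");
            try (BufferedWriter writer =
                     Files.newBufferedWriter(path, StandardCharsets.UTF_8)) {
                writer.write(text);
            }
            byte[] bytes = Files.readAllBytes(path);
            Files.delete(path);
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Dump the bytes as hex so the encoding can be inspected directly.
        StringBuilder hex = new StringBuilder();
        for (byte b : writeAndReadBack(SAMPLE)) {
            hex.append(String.format("%02x ", b & 0xff));
        }
        System.out.println(hex.toString().trim());
    }
}
```

In the hex dump, a correctly encoded ø appears as the two bytes `c3 b8`, while a character that was replaced by a question mark appears as `3f`. Inspecting bytes this way avoids being misled by a terminal or viewer that decodes the file with a different charset.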

I have run tests on the boxes to check the JVM default encodings, and they are all UTF-8.
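A check like the one described can be sketched as follows. The class name `EncodingReport` and the method `describe` are hypothetical. One caveat, stated as an assumption rather than a diagnosis: a check run as a standalone program reports the settings of that JVM only, so the same call may be worth logging from inside the bolt to see what the worker process itself reports.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the original code.
class EncodingReport {
    // Reports both the system property and the charset the JVM actually uses
    // as its default; the two can be compared when debugging.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
            + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```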

I have tried 3 different methods of writing the file and all of them show the same issue, so the writing method itself is definitely not the cause.

This issue was fixed by simply building another machine on EC2. It had exactly the same software versions and configuration as the boxes with issues.
