
How to write Hebrew strings to a log file using log4j

How can I write a Hebrew string to a log4j file? Right now I see ?????? in the file.

I have searched everywhere online for how to convert Unicode to a string, and this is what I tried:

String abc = myStr.replaceAll("\u200F", "");
abc = abc.replaceAll("\u200E", "");
byte[] utf8Bytes = abc.getBytes(Charset.forName("UTF-8"));
String value = new String(utf8Bytes);
log.debug("value : " + value);

I just need to write out a Hebrew string to a Log4j file in a readable format. Here is my configuration:

log4j.rootLogger=debug, stdout, R
log4j.logger.testlogging=DEBUG
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd} %5p [%t] (%F:%L) - %m%n
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=C:\\dri\\ums.log
log4j.appender.R.MaxBackupIndex=5
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern= %d{dd MMM yyyy HH:mm:ss,SSS} %5p [%t] (%F:%L) - %m%n
log4j.appender.FILE.encoding=UTF-8
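
One detail in this configuration may be worth a look: the file appender is named R, yet the encoding is set on an appender named FILE, which is not defined anywhere in the snippet. Assuming this is Log4j 1.x, where appenders derived from WriterAppender accept an encoding property, that last line would have no effect on R, and R would write with the platform default encoding. A minimal sketch of the adjusted lines, not verified against the asker's environment:

```properties
# Sketch: set the encoding on the appender that is actually defined (R),
# instead of on the undefined appender FILE.
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=C:\\dri\\ums.log
log4j.appender.R.encoding=UTF-8
```

If the platform default encoding cannot represent Hebrew, this alone could explain question marks being written into the file.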

Based on what I've gathered from the comments and my own experience, this is most probably not an issue with Log4j itself. I posted a comment saying just that:

What exactly do you mean by "Log4j file"? Is it a regular text log file that the FileAppender points to? I just tried printing Hebrew text and it all works fine. I believe this is not a Log4j issue and might be related to your text reader.

Other commenters share the suspicion that your text reader is causing this issue. I was able to reproduce it by doing the following in Notepad++:

  • Open a new tab in Notepad++.
  • Copy and paste sample text containing Hebrew letters.
  • Encoding -> Convert to ANSI

Text before the conversion:

See also: אלף־בית‎ and אַלף־בית‎

Text after conversion:

See also: ???????? and ?????????

Based on the code you provided (and assuming nothing else is going on behind the scenes that we don't know about), there are two likely explanations. Either you are writing to a file whose encoding is set to ANSI, so every character that cannot be encoded is replaced with a question mark, or your characters are stored as UTF-8 but merely displayed as ANSI.
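
To make the two cases concrete, here is a small sketch. It uses ISO-8859-1 as a stand-in for "ANSI" (an assumption: the actual Windows code page on your machine may differ), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Encodes text with one charset and decodes the bytes with another,
    // mimicking "written as X, displayed as Y".
    static String roundTrip(String text, Charset writeAs, Charset readAs) {
        return new String(text.getBytes(writeAs), readAs);
    }

    public static void main(String[] args) {
        // Hebrew letters alef, lamed, final pe, written as escapes so the
        // example does not depend on the source file's own encoding.
        String hebrew = "\u05D0\u05DC\u05E3";

        // Case 1: written in a single-byte Latin charset. Hebrew cannot be
        // encoded, so each letter is replaced by '?' and the text is lost.
        String lossy = roundTrip(hebrew, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println(lossy); // ???

        // Case 2: written as UTF-8 but displayed in a single-byte charset.
        // Each letter occupies two bytes, so it shows up as two wrong characters.
        String misread = roundTrip(hebrew, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(misread.length()); // 6

        // In case 2 the bytes are intact, so decoding them correctly recovers the text.
        String recovered = new String(misread.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
        System.out.println(recovered.equals(hebrew)); // true
    }
}
```

The practical difference: in the first case the information is gone from the file and no viewer setting will bring it back, while in the second case only the display is wrong.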

ANSI and UTF-8 are both character encodings. "ANSI" usually refers to a single-byte Windows code page such as Windows-1252, which covers the Latin alphabet, whereas UTF-8 is a variable-length Unicode encoding (1 to 4 bytes per character) that can represent every Unicode character.
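
The variable length is easy to see by counting bytes. A short sketch (the class and helper names are only for illustration):

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Returns how many bytes the given text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1 byte:  Latin letter A
        System.out.println(utf8Length("\u00E9"));       // 2 bytes: e with acute accent
        System.out.println(utf8Length("\u05D0"));       // 2 bytes: Hebrew letter alef
        System.out.println(utf8Length("\u20AC"));       // 3 bytes: euro sign
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 bytes: U+1F600, outside the basic plane
    }
}
```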

I would recommend following these steps:

  • Navigate to Settings -> Preferences -> New Document -> Encoding and make sure that the UTF-8 (Apply to opened ANSI files) option is selected.

  • Close all files currently open in Notepad++ and delete the log file. Make sure you actually close the files instead of just closing Notepad++. This should clear the file entries from the cache and allow you to open them again with a different encoding.

  • Run your Java application and let Log4j print to the file.

  • Open the file with Notepad++ and check that it is read as UTF-8 by opening the Encoding menu. If the option is not set to UTF-8, change it.

  • If none of the above worked, please post further information in the comments.
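
If you want to take the text reader out of the picture entirely, you can inspect the bytes of the log file directly. The following is only a sketch: the class name, the helper names, and passing the log path as the first argument are my own choices, not anything Log4j provides.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class LogFileCheck {
    // Counts characters from the Unicode Hebrew block (U+0590 to U+05FF).
    static long countHebrew(String text) {
        return text.chars().filter(c -> c >= 0x0590 && c <= 0x05FF).count();
    }

    // Decodes the bytes strictly as UTF-8; returns null if they are not valid UTF-8.
    static String decodeStrictUtf8(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LogFileCheck <path-to-log-file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String text = decodeStrictUtf8(bytes);
        if (text == null) {
            System.out.println("Not valid UTF-8: the file was written in some other encoding.");
        } else {
            System.out.println("Valid UTF-8, Hebrew characters found: " + countHebrew(text));
        }
    }
}
```

If this reports Hebrew characters, the file is fine and the problem is in how it is displayed. If it reports zero Hebrew characters for a file that should contain them, the question marks were written into the file itself, and changing the viewer's encoding will not help.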

Unfortunately, I am not that well versed in encoding matters and had to look some things up while writing this, so I can't help you as much as I would like to. However, in addition to the steps above, I can direct you to the following links, which should give you further knowledge and, consequently, more insight into your problem:
