
Special characters from txt file

I am downloading a text file from an FTP server, using a common FTP library.

The problem is that when I read the file into an array line by line, characters such as æøå are lost. Instead, each one just shows up as the "?" character.

Here is my code:

  FileInputStream fstream = openFileInput("name of text file");
  BufferedReader br = new BufferedReader(new InputStreamReader(fstream, "UTF-8"));
  String strLine;

  ArrayList<String> lines = new ArrayList<String>(); 

  while ((strLine = br.readLine()) != null)   {
      lines.add(strLine);
  }

  String[] linjer = lines.toArray(new String[0]);

  ArrayList<String> imei = new ArrayList<String>(); 

  for(int o=0;o<linjer.length;o++)
  {
      String[] holder = linjer[o].split(" - ");
      imei.add(holder[0] + " - " + holder[2]);
  }

  String[] imeinr = imei.toArray(new String[0]);

I have tried passing UTF-8 to my InputStreamReader, and I have tried a UnicodeReader class, but with no success.

I am fairly new to Java, so this might just be a stupid question, but I hope you can help. :)

There is no reason to use a DataInputStream. The DataInputStream and DataOutputStream classes are meant for reading and writing primitive Java data types in a binary format. You are just reading the contents of a text file line by line, so DataInputStream is unnecessary and may produce incorrect results.

FileInputStream fstream = openFileInput("name of text file");
//DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(fstream, "UTF-8"));
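
As a rough sketch of the same idea outside Android, the reading step can be wrapped in a small helper that takes any InputStream and an explicit Charset. The class name LineReader and the method readLines are made up for illustration and do not come from the question; on Android the stream would be the one returned by openFileInput.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original question.
final class LineReader {
    // Reads all lines from the stream, decoding bytes with the given charset.
    static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader (and the wrapped stream) when done.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for the downloaded file.
        byte[] data = "first\nsecond\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
    }
}
```

Passing the charset as a parameter makes it easy to try a different encoding if UTF-8 turns out to be the wrong guess.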

Tip: The enhanced for loop ("foreach") has been part of the Java language since Java 5. It lets you iterate through the contents of an array without defining a loop counter. This simplifies your code, making it easier to read and maintain over time.

for(String line : linjer){
  String[] holder = line.split(" - ");
  imei.add(holder[0] + " - " + holder[2]);
}

Note: Foreach loops can also be used with List objects.
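
A minimal sketch of looping over the List directly, which would make the intermediate String[] conversion unnecessary. The class name ImeiLines and the method firstAndThirdFields are hypothetical, and the length check is an added assumption: the original code would throw an ArrayIndexOutOfBoundsException on a line with fewer than three fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original question.
final class ImeiLines {
    // Keeps the first and third " - "-separated fields of each line.
    static List<String> firstAndThirdFields(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String[] holder = line.split(" - ");
            if (holder.length < 3) {
                continue; // assumption: silently skip lines without three fields
            }
            result.add(holder[0] + " - " + holder[2]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("123 - name - place", "malformed line");
        System.out.println(firstAndThirdFields(lines)); // prints [123 - place]
    }
}
```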

I would suggest that the file may not be in UTF-8. It could be in CP1252 or something similar, especially if you're using Windows.

Try downloading the file and running your code on the local copy to see if that works.
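
To illustrate why a wrong charset guess could produce this symptom, here is a small sketch that decodes the bytes "æøå" would have in a CP1252 file, once as UTF-8 and once as windows-1252. The class name CharsetCheck is invented for the example, and it assumes the JVM provides the windows-1252 charset, which is not among the charsets every Java platform is required to support.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
final class CharsetCheck {
    // Decodes raw bytes with the given charset; malformed input is replaced, not rejected.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "æøå" as stored in a windows-1252 (CP1252) file: one byte per letter.
        byte[] cp1252Bytes = {(byte) 0xE6, (byte) 0xF8, (byte) 0xE5};

        String asUtf8 = decode(cp1252Bytes, StandardCharsets.UTF_8);
        String asCp1252 = decode(cp1252Bytes, Charset.forName("windows-1252"));

        // These bytes are not valid UTF-8, so the decoder substitutes U+FFFD,
        // which many displays render as "?" or a similar placeholder.
        System.out.println("UTF-8 decode contains U+FFFD: " + asUtf8.contains("\uFFFD"));
        System.out.println("windows-1252 decode matches: " + asCp1252.equals("\u00E6\u00F8\u00E5"));
    }
}
```

If the second decode gives the expected letters for your file, replacing "UTF-8" with "windows-1252" in the InputStreamReader is worth trying.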

FTP has two transfer modes: binary and ASCII. Make sure you are using the correct mode. Look here for details: http://www.rhinosoft.com/newsletter/NewsL2008-03-18.asp
