How to read integers/doubles from a large text file in Java

I am making a Pi-based RNG (random number generator) for a research project. I am stumped at this point because I can't figure out how to read the digits from a rather large file (1 GB). Here is the input:

........ ........

The file is ugly, I know... it's Pi to the one-billionth decimal place. I won't go into why I am doing this, but here is my goal: I want to be able to skip x decimal places before I begin printing output, and I also need to be able to read y consecutive digits at a time. For example, with 4 digits at a time the output would look like:

1111\n 2222\n 3333\n 4444\n....

My basic objective is to print at least one digit at a time, because after that I can piece them together however I want. So the basic output is:

For input 3.1415.. I get 3,1,4,1,5....

I tried a bunch of file streams from the Java API, but they only print bytes/bits... I have no idea how to convert them into something meaningful.

Also, reading line by line is not optimal, because my numbers have to be the same length and I feel that reading line by line would cut them off in a funny way.

What you need is a character stream, basically a subclass of Reader, so you can read character by character rather than byte by byte.

To achieve what you need, you will have to:

  • open a character stream to the file containing your input digits. Prefer a BufferedReader wrapped around a FileReader to speed up the I/O, since unbuffered char-by-char reads can be very slow, especially with large files
  • keep track of the previous character read (if any) and group runs of identical characters in an appropriate data structure (for instance a StringBuilder)
  • if you need to skip the first n characters, call Reader.skip(n) at the start, and check its return value, since it may skip fewer than n characters

The following code does exactly what I understand your requirements to be:

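import java.io.BufferedReader;
import java.io.FileReader;
import java.io.Reader;

// Prints each run of identical consecutive characters on its own line.
public class Test {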
  public static void main(String[] args) {
    final char decimalSeparator = ',';
    try (Reader reader = new BufferedReader(new FileReader("pi.txt"))) {
      int prevC = -1; // previous character read from the stream
      int c; // latest character read from the stream
      StringBuilder sb = new StringBuilder();
      while ((c = reader.read()) != -1) {
        // if first digit or same as previous digit
        if ((prevC == -1) || (c == prevC)) {
          sb.append((char) c);
        } else {
          // print the group of digits and reset sb
          if (sb.length() > 0) {
            System.out.println(sb.toString());
            sb = new StringBuilder();
          }
          sb.append((char) c);
        }
        prevC = c;
      }
      // print the last digits group
      if (sb.length() > 0) {
        System.out.println(sb.toString());
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
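
The question also asks for skipping x decimal places and then reading y digits at a time. A minimal sketch of that, assuming the file starts with an integer part followed by '.' (as in 3.1415..) and may contain line breaks or spaces between digits; the class name DigitChunks and the helpers nextDigit and readGroups are hypothetical names made up for this illustration. It runs on an in-memory sample so it works without the 1 GB file:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class DigitChunks {

  // Returns the next ASCII digit character, ignoring anything else
  // (line breaks, spaces). Returns -1 at end of stream.
  public static int nextDigit(Reader in) throws IOException {
    int c;
    while ((c = in.read()) != -1) {
      if (c >= '0' && c <= '9') {
        return c;
      }
    }
    return -1;
  }

  // Skips the integer part and the first `skip` decimal digits, then
  // returns up to `count` groups of `size` consecutive digits each.
  // Assumption: the input contains a '.' before the decimal digits.
  public static List<String> readGroups(Reader in, long skip, int size, int count)
      throws IOException {
    List<String> groups = new ArrayList<>();
    int c;
    do {
      c = in.read(); // consume everything up to and including '.'
    } while (c != -1 && c != '.');
    for (long i = 0; i < skip; i++) {
      if (nextDigit(in) == -1) {
        return groups; // fewer digits than requested to skip
      }
    }
    StringBuilder sb = new StringBuilder();
    while (groups.size() < count) {
      int d = nextDigit(in);
      if (d == -1) {
        break; // an incomplete final group is dropped
      }
      sb.append((char) d);
      if (sb.length() == size) {
        groups.add(sb.toString());
        sb.setLength(0);
      }
    }
    return groups;
  }

  public static void main(String[] args) throws IOException {
    // In-memory sample; for the real file, wrap a FileReader instead:
    // new BufferedReader(new FileReader("pi.txt"))
    try (Reader in = new BufferedReader(new StringReader("3.14159265\n358979"))) {
      for (String group : readGroups(in, 2, 4, 3)) {
        System.out.println(group); // prints 1592, 6535, 8979 on separate lines
      }
    }
  }
}
```

Counting digits one by one instead of calling Reader.skip(n) is slower, but it keeps the offset correct if the file contains line breaks, which skip(n) would count as characters.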

Okay, I have spoken to a CS professor, and it seems that I had forgotten my basic Java training. In an ASCII file, 1 byte = 1 char. In this case BufferedInputStream returns the ASCII values of those chars, which can be cast to char. Here is a simple solution:

FileInputStream ifs = new FileInputStream(pi); //Input File containing 1 billion digits
BufferedInputStream bis = new BufferedInputStream(ifs);
System.out.println((char)bis.read()); //Build strings or parse chars how you want

...Rinse and repeat. Sorry for wasting time... but I hope this will set someone on the right track down the road.
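
The "rinse and repeat" step could look roughly like the loop below. This is only a sketch, assuming the file is plain ASCII; the class name ByteDigits and the helper nextDigitValue are hypothetical, and a ByteArrayInputStream stands in for the FileInputStream so the example runs without the real file. Subtracting '0' from the byte turns the ASCII code into the digit's numeric value:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ByteDigits {

  // Returns the next digit as a number 0-9, or -1 at end of stream.
  // Bytes that are not ASCII digits (the '.', line breaks) are skipped.
  public static int nextDigitValue(InputStream in) throws IOException {
    int b;
    while ((b = in.read()) != -1) {
      if (b >= '0' && b <= '9') {
        return b - '0';
      }
    }
    return -1;
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for new FileInputStream(pi) from the snippet above.
    byte[] sample = "3.1415".getBytes(StandardCharsets.US_ASCII);
    try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(sample))) {
      StringBuilder out = new StringBuilder();
      int d;
      while ((d = nextDigitValue(in)) != -1) {
        if (out.length() > 0) {
          out.append(',');
        }
        out.append(d);
      }
      System.out.println(out); // prints 3,1,4,1,5
    }
  }
}
```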
