
Is there a way to check whether Java has started reading a new line when reading in a text file?

I'm creating a program that reads in large chunks of data, and I need to separate them. Is there a function in Java that would notify me when it starts reading a new line? I am using a Scanner to read my text files, and the files are CSV files, if that changes anything.

I've looked online for a way to solve this and read through the methods that Scanner provides, but couldn't find anything useful.

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ScannerReading {

    public static void main(String[] args) throws FileNotFoundException {

        File file = new File("C:\\myfile.txt");
        Scanner scanner = new Scanner(file);
        scanner.useDelimiter(",");
        String data = scanner.nextLine();
        data = scanner.nextLine();

        while (scanner.hasNext()) {
            if (data.contains("  ")) {
                System.out.println("I have a line lol");
            }
            System.out.print(data + " ");
        }
        scanner.close();
    }

}

I am expecting output like: Line 1: INFORMATION EXTRACTED FROM THE FIRST LINE

The direct answer to your question is that there is no way to find out whether you have just started a new line when you use Scanner::next / Scanner::hasNext. More generally, there is no way to find out what the last delimiter was. The delimiters are discarded.
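To illustrate the point about discarded delimiters, here is a minimal sketch. It assumes a delimiter pattern that treats both commas and line breaks as separators; the class name DelimiterDemo and the helper tokens are made up for this example and are not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterDemo {

    // Collects every token the Scanner returns when both ',' and line
    // breaks are treated as delimiters (an assumption for this demo).
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(",|\\R");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two lines of two fields each ...
        System.out.println(tokens("a,b\nc,d"));
        // ... and a single line of four fields yield the same tokens.
        System.out.println(tokens("a,b,c,d"));
    }
}
```

Both calls print [a, b, c, d], so the caller cannot tell from the tokens alone where a line ended.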

As JB Nizet says, there are lots of existing open source CSV reader libraries, so there is no need to implement this functionality using Scanner. Indeed, implementing CSV reading properly is not trivial, especially if you need to handle headers, quoting, escaping and/or continuation lines. Using an existing library is advisable.

But if (against advice!) you decide to implement the reader directly, then a more robust approach is to use a nested loop:

  • The outer loop reads complete lines using nextLine.
  • The inner loop creates a Scanner for each line to split it into fields.
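The nested loop could be sketched roughly as follows. This is only an illustration under simple assumptions: fields are separated by plain commas with no quoting, and the sample data is an inline string standing in for a real file. The names NestedScannerReader and readRows are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class NestedScannerReader {

    // Splits the input into lines, then each line into comma-separated fields.
    static List<List<String>> readRows(Scanner lines) {
        List<List<String>> rows = new ArrayList<>();
        while (lines.hasNextLine()) {               // outer loop: one pass per line
            String line = lines.nextLine();
            List<String> fields = new ArrayList<>();
            try (Scanner fieldScanner = new Scanner(line)) {
                fieldScanner.useDelimiter(",");
                while (fieldScanner.hasNext()) {    // inner loop: one pass per field
                    fields.add(fieldScanner.next());
                }
            }
            rows.add(fields);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Inline sample data; for a real file, pass new Scanner(new File(...)) instead.
        try (Scanner lines = new Scanner("id,name,city\n1,Ann,Oslo\n2,Bob,Rome")) {
            int lineNumber = 0;
            for (List<String> row : readRows(lines)) {
                lineNumber++;
                System.out.println("Line " + lineNumber + ": " + row);
            }
        }
    }
}
```

Because each pass of the outer loop corresponds to exactly one line, the line number is always known while the fields are being processed.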

Except that this doesn't deal with quoting, escaping, continuation lines, etc. The real problem is that the CSV grammar doesn't have a simple context-independent delimiter.
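A small sketch of why the delimiter is context dependent: the sample record below is invented for illustration and has two logical fields, one of which is quoted and contains a comma. The class name QuotedFieldPitfall and the helper naiveSplit are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class QuotedFieldPitfall {

    // Naive split on every comma, ignoring CSV quoting rules.
    static List<String> naiveSplit(String line) {
        return Arrays.asList(line.split(","));
    }

    public static void main(String[] args) {
        // Two logical fields: a quoted name containing a comma, and a number.
        String line = "\"Smith, John\",42";
        List<String> fields = naiveSplit(line);
        System.out.println(fields.size() + " fields: " + fields);
    }
}
```

The naive split reports 3 fields instead of 2, because the comma inside the quotes is treated as a separator. Handling this correctly needs a parser that tracks whether it is inside a quoted field, which is what the existing CSV libraries do.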

I guess I could maybe use some sort of counter to count [fields]

Yeah ... but if some of the lines in your CSV are missing fields (e.g. due to human error), then counting the fields won't detect this.
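Here is a rough sketch of how field counting can go wrong. It assumes every record is supposed to have three fields and that tokens are read with a delimiter matching commas and line breaks; the sample data, the class name FieldCountPitfall and the helper groupByCount are all made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class FieldCountPitfall {

    // Groups tokens into records of a fixed size, ignoring where lines really end.
    static List<List<String>> groupByCount(String input, int fieldsPerRecord) {
        List<List<String>> records = new ArrayList<>();
        List<String> current = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(",|\\R");
            while (scanner.hasNext()) {
                current.add(scanner.next());
                if (current.size() == fieldsPerRecord) {
                    records.add(current);
                    current = new ArrayList<>();
                }
            }
        }
        if (!current.isEmpty()) {
            records.add(current);   // leftover tokens that did not fill a record
        }
        return records;
    }

    public static void main(String[] args) {
        // The second line is missing its third field.
        String csv = "1,Ann,Oslo\n2,Bob\n3,Cid,Rome";
        for (List<String> record : groupByCount(csv, 3)) {
            System.out.println(record);
        }
    }
}
```

The second record comes out as [2, Bob, 3]: the counter silently borrows the first field of the next line, and every record after the short line is misaligned.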
