BufferedReader readline when reading in a thread

I'm new to concurrent programming in Java.

I need to read, analyze and process an extremely fast-growing logfile, so I have to be fast. My idea was to read the file line by line and, upon matching a relevant line, pass that line to a separate thread that does further processing on it. I called these threads "IOThread" in the following example code.

My problem is that BufferedReader.readLine() in IOThread.run() apparently never returns. What is a working way to read the stream inside the thread? Are there any better approaches than the one below?

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class IOThread extends Thread {
    private InputStream is;
    private int t;

    public IOThread(InputStream is, int t)  {
        this.is = is;
        this.t = t;
        System.out.println("iothread<" + t + ">.init");
    }

    public void run() {
        try {
            System.out.println("iothread<" + t + ">.run");
            String line;

            BufferedReader streamReader = new BufferedReader(new InputStreamReader(is));
            while ((line = streamReader.readLine()) != null) {
                System.out.println("iothread<" + t + "> got line " + line);
            }
            System.out.println("iothread " + t + " end run");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

public class Stm {
    public Stm(String filePath) {
        System.out.println("start");

        try {
            BufferedReader reader = new BufferedReader(new FileReader(filePath));

            PipedOutputStream po1 = new PipedOutputStream();
            PipedOutputStream po2 = new PipedOutputStream();
            PipedInputStream pi1 = new PipedInputStream(po1);
            PipedInputStream pi2 = new PipedInputStream(po2);
            IOThread it1 = new IOThread(pi1,1);
            IOThread it2 = new IOThread(pi2,2);

            it1.start();
            it2.start();
//          it1.join();
//          it2.join();

            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println("got line " + line);

                if (line.contains("aaa")) {
                    System.out.println("passing to thread 1: " + line);  
                    po1.write(line.getBytes());
                } else if (line.contains("bbb")) {
                    System.out.println("passing to thread 2: " + line);  
                    po2.write(line.getBytes());
                }
            }
            reader.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new Stm(args[0]);
    }

}

An example input file would be:

line 1
line 2
line 3 aaa ...
line 4
line 5 bbb ...
line 6 aaa ...
line 7
line 8 bbb ...
line 9 bbb ...
line 10

Call the above code with the filename of the input file as argument.

The reader in your IOThread stays stuck at the head of the first iteration of its while loop for the following reason: you pass the content of each line from your Stm thread, but you do not append a newline character (\n). BufferedReader.readLine() only returns once it sees a line terminator or the end of the stream, and since the pipe is never closed either, it waits forever. You could modify your code like this:

if (line.contains("aaa")) {
    System.out.println("passing to thread 1: " + line);
    // readLine() stripped the line terminator, so add one back before writing to the pipe
    byte[] payload = (line + "\n").getBytes();
    po1.write(payload);
} else if (line.contains("bbb")) {
    System.out.println("passing to thread 2: " + line);
    byte[] payload = (line + "\n").getBytes();
    po2.write(payload);
}
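To see both halves of the behaviour in isolation, here is a minimal sketch that pushes a few lines through a single pipe. The class name PipeLineDemo and the helper roundTrip are made up for this illustration and are not part of the question's code. It assumes the writer terminates every line and closes the pipe when it is done, which is what lets readLine() eventually return null in the consumer.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original Stm/IOThread code.
class PipeLineDemo {

    // Sends each line through a pipe to a consumer thread and returns what the consumer read.
    static List<String> roundTrip(List<String> lines) {
        // Plain ArrayList is enough: it is only read after consumer.join().
        List<String> received = new ArrayList<>();
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);

            Thread consumer = new Thread(() -> {
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(in, StandardCharsets.UTF_8))) {
                    String line;
                    // Returns a line once a terminator arrives, and null once the writer closes the pipe.
                    while ((line = reader.readLine()) != null) {
                        received.add(line);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            consumer.start();

            try (BufferedWriter writer = new BufferedWriter(
                    new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
                for (String line : lines) {
                    writer.write(line);
                    writer.newLine(); // without this the consumer never sees a complete line
                }
            } // leaving the try block flushes the writer and closes the pipe

            consumer.join();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(List.of("line 3 aaa ...", "line 6 aaa ...")));
    }
}
```

Removing the newLine() call makes the consumer collect everything as one line that only arrives when the pipe is closed; removing the close as well leaves it blocked in readLine().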

But I have to say that this is not an elegant solution at all. You could use a blocking queue or something similar to supply your IOThreads with content. That way you avoid converting your input from strings to bytes and back to strings, and you get rid of all the streams as well.
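As a rough sketch of the blocking-queue idea, the version below gives each consumer thread its own LinkedBlockingQueue and routes lines by the same "aaa"/"bbb" checks as the question. The names QueueRoutingDemo, LineWorker, route and the POISON sentinel are assumptions made for this example; the sentinel is one common way to tell a consumer that no more input is coming, not the only one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class: the pipes from the question replaced by one queue per thread.
class QueueRoutingDemo {

    // End-of-input marker. It is compared by identity, so it cannot collide with a real line.
    private static final String POISON = new String("<end of input>");

    // Consumer thread: takes lines from its own queue until it sees the sentinel.
    static class LineWorker extends Thread {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final List<String> handled = new ArrayList<>(); // read by the caller only after join()

        @Override
        public void run() {
            try {
                String line;
                while ((line = queue.take()) != POISON) {
                    handled.add(line); // placeholder for the real processing
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Routes each line to worker 1 ("aaa") or worker 2 ("bbb") and returns what each one handled.
    static List<List<String>> route(List<String> lines) {
        LineWorker w1 = new LineWorker();
        LineWorker w2 = new LineWorker();
        w1.start();
        w2.start();
        try {
            for (String line : lines) {
                if (line.contains("aaa")) {
                    w1.queue.put(line);
                } else if (line.contains("bbb")) {
                    w2.queue.put(line);
                }
            }
            // Tell both workers that no more lines will arrive, then wait for them.
            w1.queue.put(POISON);
            w2.queue.put(POISON);
            w1.join();
            w2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return List.of(w1.handled, w2.handled);
    }

    public static void main(String[] args) {
        System.out.println(route(List.of(
                "line 1", "line 3 aaa ...", "line 5 bbb ...", "line 6 aaa ...")));
    }
}
```

The lines stay String objects the whole way, so there is no line terminator to forget and no byte conversion in between.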

IMHO you have got it backwards. Create multiple threads for processing, not for reading data from the file. When reading from a file you are bottlenecked on I/O anyway, so having multiple reader threads won't make any difference. The simplest solution is to read lines as fast as you can in a single thread and store them in a shared queue. That queue can then be accessed by any number of threads that do the relevant processing.

This way you can actually do the processing concurrently while the I/O (reader) thread is busy reading or waiting for data. If possible, keep the logic in the reader thread to a minimum. Just read the lines and let the worker threads do the real heavy lifting (pattern matching, further processing, etc.). Just go with a thread-safe queue and you should be fine.

EDIT: Use some variant of BlockingQueue, either array-based (ArrayBlockingQueue) or linked-list-based (LinkedBlockingQueue).
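The layout described here, one thread that only reads and several workers that do the matching, could look roughly like the sketch below. SharedQueueDemo, countMatches, the queue capacity of 1024 and the POISON sentinel are all illustrative choices, not anything taken from the question. A bounded ArrayBlockingQueue is used so that put() blocks the reader when the workers fall behind, instead of letting the queue grow without limit.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: one producer, several workers, one shared queue.
class SharedQueueDemo {

    // End-of-input marker, compared by identity so it cannot collide with a real line.
    private static final String POISON = new String("<end of input>");

    // Counts the lines containing `marker`, with the matching done by `workers` threads.
    static int countMatches(Iterable<String> lines, String marker, int workers) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024); // capacity is arbitrary
        AtomicInteger matches = new AtomicInteger();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    String line;
                    while ((line = queue.take()) != POISON) {
                        // The "heavy lifting" lives here, not in the reader.
                        if (line.contains(marker)) {
                            matches.incrementAndGet();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        try {
            // Reader side: no matching logic, just hand every line over.
            for (String line : lines) {
                queue.put(line);
            }
            // One sentinel per worker, since each worker consumes exactly one before exiting.
            for (int i = 0; i < workers; i++) {
                queue.put(POISON);
            }
            for (Thread t : pool) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return matches.get();
    }

    public static void main(String[] args) {
        List<String> sample = List.of("line 1", "line 3 aaa ...", "line 5 bbb ...", "line 6 aaa ...");
        System.out.println(countMatches(sample, "aaa", 3));
    }
}
```

In the question's setting the Iterable would be replaced by the BufferedReader.readLine() loop over the logfile. Note that with several workers on one queue the lines are no longer processed in file order.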
