RandomAccessFile issue

I have to watch a file: when content is appended to it, I read the new line and process it. The file's length never decreases (in fact, it is the Tomcat log file).

I use the following code:


import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;

import org.apache.log4j.Logger;

import com.zjswkj.analyser.ddao.LogEntryDao;
import com.zjswkj.analyser.model.LogEntry;
import com.zjswkj.analyser.parser.LogParser;

public class ListenTest {
    private RandomAccessFile    raf;
    private long                lastPosition;
    private String              logEntryPattern = "^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\S+) \"([^\"]+)\" \"([^\"]+)\"";
    private static Logger       log             = Logger.getLogger(ListenTest.class);

    public void startListenLogOfCurrentDay() {

        try {
            if (raf == null)
                raf = new RandomAccessFile(
                        "/tmp/logs/localhost_access_log.2010-12-20.txt",
                        "r");
            String line;
            while (true) {
                raf.seek(lastPosition);
                while ((line = raf.readLine()) != null) {
                    if (!line.matches(logEntryPattern)) {
                        // not a complete line,roll back
                        lastPosition = raf.getFilePointer() - line.getBytes().length;
                        log.debug("roll back:" + line.getBytes().length + " bytes");
                        if (line.equals(""))
                            continue;
                        log.warn("broken line:[" + line + "]");
                        Thread.sleep(2000);
                    } else {
                        // save it
                        LogEntry le = LogParser.parseLog(line);
                        LogEntryDao.saveLogEntry(le);
                        lastPosition = raf.getFilePointer();
                    }
                }
            }
        } catch (FileNotFoundException e) {
            log.error("can not find log file of today");
        } catch (IOException e) {
            log.error("IO Exception:" + e.getMessage());
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new ListenTest().startListenLogOfCurrentDay();
    }
}

Now my problem is that if a line being written to the file is not yet complete, an infinite loop occurs.

For example, suppose Tomcat tries to write this new line to the file:

10.33.2.45 - - [08/Dec/2010:08:44:43 +0800] "GET /poi.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8"

When only part of the line has been written (for example: < 10.33.2.45 - - [08/Dec/2010:08:44:43 +0800] "GET /poi.txt HTTP/1.1" 200 672 >), it cannot match the pattern I defined; that is, Tomcat has not finished writing it. So I roll back the file pointer, sleep for 2 seconds, and then read again.

During the sleep, the rest of the line may get written (in fact, for testing I write it myself rather than Tomcat). I expected RandomAccessFile to then read a full line that matches the pattern, but it does not seem to.

Can anyone check the code?

NOTE: the log file's format is "combined", like this:

10.33.2.45 - - [08/Dec/2010:08:44:43 +0800] "GET /poi.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8"
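To make the completeness check concrete, here is a minimal sketch that runs the pattern from the code above against the full sample line and against the truncated one. The class name CombinedLogMatchDemo and the helper isCompleteEntry are hypothetical names used only for illustration. Note that a regex match is only a heuristic for "the writer has finished this line".

```java
import java.util.regex.Pattern;

class CombinedLogMatchDemo {
    // Same expression as logEntryPattern in the question, compiled once.
    static final Pattern LOG_ENTRY = Pattern.compile(
            "^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\S+) \"([^\"]+)\" \"([^\"]+)\"");

    // True only if the whole line matches the combined log format.
    static boolean isCompleteEntry(String line) {
        return LOG_ENTRY.matcher(line).matches();
    }

    public static void main(String[] args) {
        String full = "10.33.2.45 - - [08/Dec/2010:08:44:43 +0800] \"GET /poi.txt HTTP/1.1\" 200 672 "
                + "\"-\" \"Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8\"";
        // Simulate a half-written line: cut right after the response size.
        String partial = full.substring(0, full.indexOf(" \"-\""));

        System.out.println("full matches:    " + isCompleteEntry(full));
        System.out.println("partial matches: " + isCompleteEntry(partial));
    }
}
```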

I see (from your code) that your main objective is to filter the log entries/events and then write the filtered logs to a database. You have 2 options:

Option 1: The best and right way to do it, provided you are able to change the log4j config file that comes with Tomcat.

If that is the case, the best way is to use log4j's predefined extension points. In your case the tapping point is the Appender.

Log4j already comes with the DBAppender, which you might want to extend: filter the logs using your regular expression and then delegate the rest to DBAppender, as it is well tested. Below is an example of how to configure the custom appender:

log4j.rootLogger=DEBUG, S
log4j.appender.S=com.gurock.smartinspect.log4j.MyCustomAppender
log4j.appender.S.layout=org.apache.log4j.SimpleLayout

I suggest you also look at using the AsyncAppender together with the DBAppender if you want to improve performance.

Option 2: Fallback option if you don't have access to Tomcat's log4j config file

Instead of writing your own file change listener, look at this post on SO and choose the approach that best matches your needs. You are then left only with writing the code for filtering and persisting the log in the DB. You can use this link as an example for dealing with RandomAccessFile.

RAF's readLine is a blocking method and is inefficient (it reads byte by byte and makes many system calls). Also note that in your code line.getBytes().length cannot be used accurately, because readLine strips the newline/carriage return characters.
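The mismatch can be seen with a small experiment. This is only a sketch: the class name ReadLineOffsetDemo and the temp file are made up for the demonstration. It compares how many bytes readLine actually consumed, measured with getFilePointer() before and after the call, with the byte length of the string it returned.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class ReadLineOffsetDemo {
    /** Returns {offset before readLine, offset after readLine, byte length of the returned line}. */
    static long[] measure(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long before = raf.getFilePointer();
            String line = raf.readLine();   // terminator is consumed but not returned
            long after = raf.getFilePointer();
            // readLine maps each byte to one char, so ISO-8859-1 gives the raw byte count back.
            long lineBytes = line.getBytes(StandardCharsets.ISO_8859_1).length;
            return new long[] { before, after, lineBytes };
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("offset-demo", ".log");
        f.deleteOnExit();
        try (RandomAccessFile w = new RandomAccessFile(f, "rw")) {
            w.write("abc\r\ndef".getBytes(StandardCharsets.ISO_8859_1));
        }
        long[] m = measure(f);
        // The "\r\n" pair is consumed, so the two numbers differ by 2 here.
        System.out.println("consumed=" + (m[1] - m[0]) + " lineBytes=" + m[2]);
    }
}
```

Remembering the value of getFilePointer() from before the read, rather than subtracting the line length afterwards, avoids the problem: the saved offset is exactly where the line started.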

To use a BufferedReader on a RAF, check my answer here: https://stackoverflow.com/a/19867481/1282907

I don't think this is a good way of checking for newly added lines. I recommend writing a custom appender for log4j. With a custom appender you receive every newly added line as an event. There is a sample here.

Also search Google for custom appender.

The first thing I would do in this situation is separate the issue of reading a growing file from the issue of processing the lines.

Create a class GrowingFileReader whose readLine method does what you want. Then the rest of the code becomes simpler.
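One possible shape for such a class is sketched below. Several things here are assumptions rather than anything given in the question: a line counts as complete only once its '\n' has been written, the log is decoded as UTF-8, and the file is reopened on every call to keep the sketch short. readLine returns null while the last line is still being written and leaves the stored position untouched in that case.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class GrowingFileReader {
    private final File file;
    private long position; // offset of the first byte not yet handed out

    GrowingFileReader(File file) {
        this.file = file;
    }

    /**
     * Returns the next newline-terminated line, or null if the writer
     * has not finished one yet. The position only moves on success.
     */
    String readLine() throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(position);
            ByteArrayOutputStream pending = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = raf.read(chunk)) > 0) {
                for (int i = 0; i < n; i++) {
                    if (chunk[i] == '\n') {
                        pending.write(chunk, 0, i);
                        position += pending.size() + 1; // +1 skips the '\n' itself
                        String line = new String(pending.toByteArray(), StandardCharsets.UTF_8);
                        // Drop a trailing '\r' so CRLF files behave the same way.
                        return line.endsWith("\r") ? line.substring(0, line.length() - 1) : line;
                    }
                }
                pending.write(chunk, 0, n); // no terminator in this chunk yet
            }
            return null; // incomplete (or no) line: leave position untouched
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 1) {
            System.err.println("usage: GrowingFileReader <log file>");
            return;
        }
        GrowingFileReader reader = new GrowingFileReader(new File(args[0]));
        while (true) {
            String line = reader.readLine();
            if (line == null) {
                Thread.sleep(2000); // nothing complete yet, poll again later
            } else {
                System.out.println(line); // parse and save the entry here
            }
        }
    }
}
```

With this in place the caller never sees a half-written line, so the regex is only needed for parsing, not for deciding whether to roll back.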

In the case of a failed match, why do you update lastPosition at all? Shouldn't it be left as is?
