Spark Streaming on a single log file with Java doesn't generate any output

I want to stream a log file using Java and Spark. My code is simple:

    String base = "c:/test";

    SparkConf conf = new SparkConf().setAppName("First_App").setMaster("local[2]");
    JavaStreamingContext ssc= new JavaStreamingContext(conf, Seconds.apply(1));

    JavaDStream<String> line = ssc.textFileStream(base);
    line.map(new Function<String, Integer>()
    {
        @Override
        public Integer call(String v1) throws Exception
        {
            System.out.println(v1);
            int l =  v1.length();
            return l;
        }
    });

    line.print();

    ssc.start();
    ssc.awaitTermination();

The directory c:/test contains a log file generated by Logback. Its content is:

INFO:Data=Do Save Entity
INFO:Data=Do Delete Entity

But when I run my app, the following output is printed to the console:

18/02/18 19:55:30 INFO JobScheduler: Added jobs for time 1518971130000 ms
18/02/18 19:55:30 INFO JobScheduler: Starting job streaming job 1518971130000 ms.0 from job set of time 1518971130000 ms
18/02/18 19:55:30 INFO JobScheduler: Finished job streaming job 1518971130000 ms.0 from job set of time 1518971130000 ms
18/02/18 19:55:30 INFO JobScheduler: Total delay: 0.291 s for time 1518971130000 ms (execution: 0.002 s)
-------------------------------------------
Time: 1518971130000 ms
-------------------------------------------

18/02/18 19:55:30 INFO FileInputDStream: Cleared 0 old files that were older than 1518971070000 ms: 
18/02/18 19:55:30 INFO ReceivedBlockTracker: Deleting batches: 
18/02/18 19:55:30 INFO InputInfoTracker: remove old batch metadata: 
18/02/18 19:55:31 INFO FileInputDStream: Finding new files took 16 ms
18/02/18 19:55:31 INFO FileInputDStream: New files at time 1518971131000 ms:

-------------------------------------------
Time: 1518971131000 ms
-------------------------------------------

This output repeats indefinitely. My aim is simple: stream a log file and print its content to the console. This is only temporary, because eventually I want to save the file to a database.

The reason you don't see any output is that JavaStreamingContext.textFileStream monitors a directory for newly created files (see the docs) and does not react to changes in existing files. Your log lines show this: "New files at time ..." is followed by an empty list in every batch. Some ideas for dealing with the situation you describe are mentioned here.
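
One way to work with this behaviour is to hand Spark a brand-new file each time instead of appending to one it has already seen. The Spark Streaming programming guide asks for files to appear in the monitored directory by being moved or renamed into it, so the sketch below writes a copy to a staging directory first and then moves it in a single step. This is only a minimal sketch using plain java.nio: the class name LogSnapshotPublisher, the method publishSnapshot, the staging directory and the snapshot-*.log naming are all made up for illustration and are not part of Spark. It also assumes that the staging directory is on the same file system as the monitored one, since an atomic move is generally not possible otherwise.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical helper, not part of Spark: publishes a copy of a log file
// into the directory that textFileStream monitors.
class LogSnapshotPublisher {

    /**
     * Copies logFile into stagingDir under a unique name, then moves that copy
     * into watchedDir in one atomic step, so a directory monitor never sees a
     * half-written file. Returns the path of the published file.
     */
    static Path publishSnapshot(Path logFile, Path stagingDir, Path watchedDir)
            throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(watchedDir);

        // A unique name, so every snapshot is a new file from the monitor's point of view.
        String name = "snapshot-" + System.nanoTime() + ".log";
        Path staged = stagingDir.resolve(name);

        // Write the copy outside the watched directory first.
        Files.copy(logFile, staged, StandardCopyOption.REPLACE_EXISTING);

        // ATOMIC_MOVE throws AtomicMoveNotSupportedException if the move cannot
        // be done atomically, for example across different file systems.
        return Files.move(staged, watchedDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Demo in a temporary directory; in the question the watched directory is c:/test.
        Path root = Files.createTempDirectory("spark-demo");
        Path log = root.resolve("app.log");
        Files.write(log, Arrays.asList(
                "INFO:Data=Do Save Entity",
                "INFO:Data=Do Delete Entity"));

        Path published = publishSnapshot(log, root.resolve("staging"), root.resolve("watched"));
        System.out.println("Published " + published);
        for (String line : Files.readAllLines(published)) {
            System.out.println(line);
        }
    }
}
```

Note that each snapshot contains the whole log file, so earlier lines would be delivered again with every call. If that matters, publishing only finished (rolled-over) log files may be the simpler option.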

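A second (unrelated) issue in your code is that the call to line.map does not modify line; it returns a new JavaDStream, and your code discards that result. To see the result of the transformation, keep the returned stream (for example, JavaDStream<Integer> lengths = line.map(...)) and call print on it. Calling print directly on line shows the contents of the stream without the transformation.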
