Spark streaming one log file with java doesn't generate any output
I want to stream a log file with Java and Spark Streaming. My code is simple:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Seconds;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

String base = "c:/test";
SparkConf conf = new SparkConf().setAppName("First_App").setMaster("local[2]");
JavaStreamingContext ssc = new JavaStreamingContext(conf, Seconds.apply(1));

// Monitor the directory for new text files
JavaDStream<String> line = ssc.textFileStream(base);

// Note: the result of this map is never assigned and has no output
// operation registered on it, so Spark (being lazy) never runs it.
line.map(new Function<String, Integer>()
{
    @Override
    public Integer call(String v1) throws Exception
    {
        System.out.println(v1);
        return v1.length();
    }
});

line.print();
ssc.start();
ssc.awaitTermination();
In c:/test there is a log file to which log entries are written. Its content is:
INFO:Data=Do Save Entity
INFO:Data=Do Delete Entity
But when I run my application, only the following is printed in the console:
18/02/18 19:55:30 INFO JobScheduler: Added jobs for time 1518971130000 ms
18/02/18 19:55:30 INFO JobScheduler: Starting job streaming job 1518971130000 ms.0 from job set of time 1518971130000 ms
18/02/18 19:55:30 INFO JobScheduler: Finished job streaming job 1518971130000 ms.0 from job set of time 1518971130000 ms
18/02/18 19:55:30 INFO JobScheduler: Total delay: 0.291 s for time 1518971130000 ms (execution: 0.002 s)
-------------------------------------------
Time: 1518971130000 ms
-------------------------------------------
18/02/18 19:55:30 INFO FileInputDStream: Cleared 0 old files that were older than 1518971070000 ms:
18/02/18 19:55:30 INFO ReceivedBlockTracker: Deleting batches:
18/02/18 19:55:30 INFO InputInfoTracker: remove old batch metadata:
18/02/18 19:55:31 INFO FileInputDStream: Finding new files took 16 ms
18/02/18 19:55:31 INFO FileInputDStream: New files at time 1518971131000 ms:
-------------------------------------------
Time: 1518971131000 ms
-------------------------------------------
This output repeats indefinitely. My goal is simple: stream a log file and print its contents to the console. This is just temporary, of course; eventually I want to save the data to a database.
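For context on why "New files at time … :" stays empty: textFileStream only picks up files that appear in the monitored directory after the streaming context starts, and it does not track appends to a file that is already there. A common workaround is to write the log to a staging location and atomically move the finished file into the monitored directory, so Spark sees a complete new file with a fresh timestamp. A minimal JDK-only sketch of that move (the staging and monitored directories here are hypothetical stand-ins for c:/test):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class FeedMonitoredDir {
    public static void main(String[] args) throws IOException {
        // Hypothetical directories: write outside the monitored folder first,
        // then move the complete file in, so the stream never sees a
        // half-written file. "monitored" stands in for c:/test.
        Path staging = Files.createTempDirectory("staging");
        Path monitored = Files.createTempDirectory("monitored");

        Path tmp = staging.resolve("app.log");
        Files.write(tmp, Arrays.asList(
                "INFO:Data=Do Save Entity",
                "INFO:Data=Do Delete Entity"), StandardCharsets.UTF_8);

        // The atomic move makes the file appear in the monitored directory
        // all at once, with a new modification time inside the batch window.
        Files.move(tmp, monitored.resolve("app.log"),
                StandardCopyOption.ATOMIC_MOVE);

        System.out.println(Files.readAllLines(monitored.resolve("app.log")).size());
    }
}
```

With the streaming job running against the monitored directory, a file delivered this way should show up under "New files at time … :" in the next batch.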