How to automate running a command on Spark SQL or Scala Shell every 15 mins?
How to use Spark to analyze pv, uv, and ip every 5 mins
How can I analyze uv, pv, and ip every 5 minutes throughout the day and store the results in MySQL? The data comes from Kafka in the following format:
Message sent: {"cookie":"a95f22eabc4fd4b580c011a3161a9d9d","ip":"125.119.144.252","event_time":"2017-08-07 10:50:16"}
Message sent: {"cookie":"6b67c8c700427dee7552f81f3228c927","ip":"202.109.201.181","event_time":"2017-08-07 10:50:26"}
That is, windows like 00:00-00:05, 00:05-00:10, and so on. I have tried:
import org.apache.spark.sql.streaming.ProcessingTime

// Custom ForeachWriter that writes each result batch to MySQL over JDBC
val writer = new JDBCSink()
val query = counts.writeStream
  .foreach(writer)
  .outputMode("complete")
  .trigger(ProcessingTime("5 minutes"))
  .start()
But if I submit the job at 00:01, or it crashes and is restarted, how can I make sure it does not aggregate over shifted intervals like 00:01-00:06?
Use the window function. Tumbling windows produced by window() are aligned to fixed clock boundaries (00:00, 00:05, 00:10, ...), not to the time the query was started, so a job submitted at 00:01 still aggregates over 00:00-00:05:

val counts = events
  .groupBy(window($"event_time", "5 minutes"))
  .agg(count("*"))
counts.writeStream.start()
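The epoch alignment behind this answer can be illustrated without a Spark cluster. The sketch below is a minimal, hypothetical helper (window_start is not part of Spark's API): it floors an event timestamp to its 5-minute boundary, which is the same bucketing a 5-minute tumbling window applies regardless of when the streaming query starts.

```python
from datetime import datetime, timezone

WINDOW_SECONDS = 5 * 60  # 5-minute tumbling windows

def window_start(event_time: datetime) -> datetime:
    """Floor an event time (assumed UTC) to its 5-minute window boundary."""
    epoch = int(event_time.replace(tzinfo=timezone.utc).timestamp())
    aligned = epoch - epoch % WINDOW_SECONDS
    return datetime.fromtimestamp(aligned, tz=timezone.utc)

# The sample event at 10:50:16 falls in the 10:50:00-10:55:00 window,
# and an event at 00:01:00 falls in 00:00:00-00:05:00 -- the window
# boundaries do not depend on the query's submission time.
print(window_start(datetime(2017, 8, 7, 10, 50, 16)))
print(window_start(datetime(2017, 8, 7, 0, 1, 0)))
```

Because each event is bucketed by its own event_time rather than by processing time, restarting the query (with checkpointing enabled) resumes the same 00:00-aligned windows.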