Saving Data on GAE: logging vs. datastore
I have a Google App Engine app that has to deal with a lot of data collection. The data I gather is around millions of records per day. As I see it, there are two simple approaches to dealing with this in order to be able to analyze the data:

1. Write each record to the application log, and import the logs in batches later.
2. Write each record directly to the datastore.
Is there any preferable method of doing this?

Thanks!
BigQuery has a new Streaming API, which they claim was designed for high-volume real-time data collection.
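A minimal sketch of what a streaming insert looks like with the `google-cloud-bigquery` client. The project/table name and event fields are placeholders, and batching at 500 rows is a conservative choice to stay under the per-request row limits; this assumes the modern client library rather than whatever was current when the Streaming API launched.

```python
def chunk(rows, size=500):
    """Split rows into batches of at most `size` rows, since streaming
    inserts cap the number of rows per request."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def stream_rows(client, table_id, rows):
    """Stream rows via the client's insert_rows_json; returns the
    accumulated per-row errors (an empty list means success)."""
    errors = []
    for batch in chunk(rows):
        # insert_rows_json returns [] on success, or a list of error dicts
        errors.extend(client.insert_rows_json(table_id, batch))
    return errors

if __name__ == "__main__":
    # Requires google-cloud-bigquery and credentials; all names below
    # are illustrative placeholders, not from the original question.
    from google.cloud import bigquery
    client = bigquery.Client()
    errs = stream_rows(client, "my-project.analytics.events",
                       [{"event": "pageview", "ts": "2013-10-01T00:00:00Z"}])
    print("insert errors:", errs)
```

Streaming trades the latency of batch log imports for a small per-row cost and a best-effort deduplication model, so it suits the "analyze data as it arrives" case.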
Advice from practice: we are currently logging 20M+ multi-event records a day via method 1 as described above. It works pretty well, except when the batch uploader is not called (normally every 5 minutes); then we need to detect this and re-run the importer. Also, we are currently in the process of migrating to the new Streaming API, but it is not yet in production, so I can't say how reliable it is.
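The detect-and-rerun step described above could be sketched as a small watchdog check: compare the timestamp of the last successful import against the expected schedule and re-run when it falls too far behind. The function names, the grace factor, and the idea of a persisted "last success" marker are illustrative assumptions, not details from the original setup.

```python
from datetime import datetime, timedelta

def import_is_stale(last_success, now, interval_minutes=5, grace_factor=2):
    """Return True when the batch uploader appears to have missed its
    schedule: the last successful run is older than `grace_factor`
    scheduled intervals (allowing one skipped run before alarming)."""
    return now - last_success > timedelta(minutes=interval_minutes * grace_factor)

# A watchdog cron task could persist a timestamp after each successful
# import, then periodically check it (helper names are hypothetical):
#
# if import_is_stale(read_last_success_marker(), datetime.utcnow()):
#     rerun_importer()
```

On App Engine this check would typically be wired up as a second, independent cron entry so a failure in the uploader itself cannot silently take the watchdog down with it.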