
Storing high traffic events from web pages on Google Cloud

I am trying to build (to keep it as simple as possible) "something like Google Analytics". That means: I want to store small objects (just a few fields, <2KB each) sent from web pages into some storage, and be able to query them.

I have JS code that sends those event objects to a PHP endpoint on Google App Engine. The endpoint then inserts them into Google BigQuery. Here is my problem: the insertion is done via the Google API PHP client library, which issues a REST request, so every event triggers an HTTP round trip, and that is very slow.

My question is: is there a better way to store the events in the Google Cloud environment? Would it be better (and more cost-effective?) to use PubSub or Redis to buffer the events and have some workers in the background that load this queue into BigQuery?

Any ideas on how to do this as efficiently as possible (in both performance and cost) would be greatly appreciated!

If I had to do this, I would first make the endpoint handler save the raw data into a push queue, because enqueuing is relatively fast on App Engine. The processing of the data and the BigQuery API calls would then be done later, in the task queue.
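The enqueue-then-batch pattern described above can be sketched as follows. This is a minimal Python simulation, not the actual App Engine API: the in-memory queue stands in for the push/task queue, and `insert_rows_into_bigquery` is a stub for the real BigQuery streaming-insert call (`tabledata.insertAll`). All function names here are hypothetical.

```python
import queue

# In-memory stand-in for App Engine's push queue (hypothetical simulation).
event_queue = queue.Queue()

# BigQuery streaming inserts accept batches of rows, so the worker
# should group events rather than issue one HTTP request per event.
BATCH_SIZE = 500

def handle_request(event):
    """Endpoint handler: just enqueue the raw event and return immediately."""
    event_queue.put(event)

def insert_rows_into_bigquery(rows):
    """Stub for the real BigQuery insert call made later by the worker."""
    print(f"inserted batch of {len(rows)} rows")

def drain_queue():
    """Background worker: collect up to BATCH_SIZE events and insert
    them into BigQuery in a single call. Returns the batch size."""
    batch = []
    while not event_queue.empty() and len(batch) < BATCH_SIZE:
        batch.append(event_queue.get())
    if batch:
        insert_rows_into_bigquery(batch)
    return len(batch)
```

The key point is that the user-facing request only pays the cost of an enqueue, while the slow HTTP insert is amortized across a whole batch in the background task.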

I guess the performance and cost you can get also vary a bit depending on the App Engine language (PHP, Go, Java, ...).
