
Building an XMPP server for upstream Google GCM

Let's say I have an app with tens of millions of installs and tens of thousands of active users at any given point in time. I need to log my users' activity data to my servers. Currently, I make HTTP requests from the device to my servers. I have a bunch of machines running a web server, sitting behind Amazon's ELB. They parse the data coming from the devices and put it in MongoDB.

Now, I would like to capture device data using the upstream CCS provided by Google's GCM (so that I can piggyback on GCM for more reliable delivery of data). I have written a prototype XMPP server and I can make the whole thing work, but I am worried about scaling it up. What will happen if Google starts sending me messages at a rate faster than I can consume them? Earlier, I was able to use multiple servers behind a load balancer to handle a high request rate. Is there a concept of load balancing here?

If I open multiple connections from my server to Google's server (Google says I can have up to 1000 connections for a given sender ID), will the incoming requests be load balanced between these connections?

Finally, is there a recommended solution that takes care of most of the problems above? Would using ejabberd solve some of them?

Thanks a bunch.

What will happen if Google starts sending me messages at a rate faster than I can consume?

At the end of https://developers.google.com/cloud-messaging/ccs you may read:

Conversely, to avoid overloading the app server, CCS stops sending if there are too many unacknowledged messages. Therefore, the app server should "ACK" upstream messages, received from the client application via CCS, as soon as possible to maintain a constant flow of incoming messages. The aforementioned pending message limit doesn't apply to these ACKs. Even if the pending message count reaches 100, the app server should continue sending ACKs for messages received from CCS to avoid blocking delivery of new upstream messages.
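In practice that means the app server should ACK each upstream message as soon as it arrives, before doing any heavy processing. Below is a minimal sketch of building and sending such an ACK; the CCS ACK payload fields ("to", "message_id", "message_type": "ack") come from the GCM docs, while the xmpp connection object and its send_raw() method are assumptions standing in for whatever XMPP library you use (connection setup is omitted).

    import json
    from xml.sax.saxutils import escape

    def ack_upstream(xmpp, from_registration_id, message_id):
        """ACK an upstream CCS message immediately so the pending-message
        window on Google's side keeps draining."""
        payload = {
            "to": from_registration_id,   # the "from" field of the upstream message
            "message_id": message_id,     # the "message_id" field of the upstream message
            "message_type": "ack",
        }
        # CCS wraps the JSON payload in a <gcm> element inside an XMPP <message> stanza.
        stanza = (
            '<message id="">'
            '<gcm xmlns="google:mobile:data">'
            + escape(json.dumps(payload)) +
            '</gcm></message>'
        )
        xmpp.send_raw(stanza)  # assumed raw-send method on your XMPP connection

The key design point is to keep the ACK path cheap: hand the actual payload off to a queue or worker pool and never let slow database writes block the ACK.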

In the same document, you find a partial answer to your second and third questions:

If at any point the connection fails, you should immediately reconnect. There is no need to back off after a disconnect that happens after authentication.

To me this means that Google has implemented simple redundancy logic and probably not a fair load-balancing system (at least I hope so). If you have volumes that high, I suggest contacting them directly.
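A rough sketch of the reconnect behaviour the docs describe is shown here; connect_to_ccs() and run_session() are hypothetical helpers for "open and authenticate a connection" and "block until it drops". Per the quoted guidance, no backoff is needed after a disconnect that happens after successful authentication, but backing off on authentication failures is still sensible.

    import time

    def maintain_connection():
        while True:
            authenticated = False
            try:
                conn = connect_to_ccs()    # hypothetical: opens and authenticates an XMPP connection to CCS
                authenticated = True
                run_session(conn)          # hypothetical: processes stanzas until the connection drops
            except Exception:
                pass
            if not authenticated:
                # Failed before or during authentication: back off before retrying.
                time.sleep(5)
            # After an authenticated session drops, loop around and reconnect immediately.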

For the last questions, ejabberd is a good product; there are a lot of deployed systems with a clustered infrastructure and plenty of documentation on how to do that. I suggest you start from here: http://docs.ejabberd.im/admin/guide/clustering/ .

Anyway, for your high volumes I would also evaluate RabbitMQ, which is another Erlang gem.

ejabberd can be clustered and placed behind a load balancer to distribute connections. A 3- or 4-server cluster should be able to handle that load fine and give you failover protection. You can add servers if needed. Once you get close to 10 servers, you may want to consider using Redis for the in-memory DB rather than Mnesia.
