Should I use WebSockets in a social network app, or will PHP/AJAX suffice?

Question

I would like your opinion on a project. The approach I started with is slowly revealing many gaps and problems that will cause big issues, either now or in the future.

The system will have notifications, a friends system, private messaging, and similar features. I have set all of these up with jQuery, PHP, and mysqli round-trips. To keep this short, I will get straight to what the title asks.

With simple PHP code and POST and GET requests, all of this works great for 3-4 online users. The question is what I can do to make better use of the server's resources once I have many more users. So I started looking around and found things like socket.io.

I just want someone who knows more to tell me what would be best to look into. Consider how the notification updates work now: a jQuery POST repeated every 3-5 seconds, which is by no means right.

Answer 1

If your goal is to set up a highly scalable notification service, then probably not.

That's not a strict no, because there are other factors than speed to consider, but when it comes to speed, read on.

WebSockets do give the user a persistently open, bi-directional connection that is, by its very nature, very fast. Also, the client doesn't need to request new information; it is sent when either party deems it appropriate to send.

However, the time savings that the connection itself gives are negligible compared to the cost of generating the content. How many database calls do you make to check for new notifications? How much structured data do you generate to let the client know to change the notification icon? How many times do you read data from disk, or from across the network?
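To put rough numbers on those questions, here is a minimal sketch of the load that polling generates. The function name and all of the figures (10,000 users, a 4-second interval, 3 queries per poll) are assumptions chosen for illustration, not measurements from the asker's system.

```php
<?php
// Hypothetical helper: estimates the server load created by polling clients.
function pollingLoad(int $onlineUsers, float $pollIntervalSeconds, int $queriesPerPoll): array {
    // Each user sends one request per interval.
    $requestsPerSecond = $onlineUsers / $pollIntervalSeconds;
    return [
        'requests_per_second' => $requestsPerSecond,
        'queries_per_second'  => $requestsPerSecond * $queriesPerPoll,
    ];
}

// Assumed scenario: polling every 4 seconds, 3 database queries per poll.
foreach ([4, 10000] as $users) {
    $load = pollingLoad($users, 4.0, 3);
    printf(
        "%d users: %.0f requests/s, %.0f queries/s\n",
        $users,
        $load['requests_per_second'],
        $load['queries_per_second']
    );
}
```

Under these assumptions, 4 users cost about 1 request and 3 queries per second, while 10,000 users cost about 2,500 requests and 7,500 queries per second. Most of that cost is in the queries, which a faster transport does not remove.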

These costs do not go away when you use a WebSocket server; a WebSocket server just makes one mitigation technique more obvious: keep the user's notification state in memory and update it as notifications change, to prevent costly trips to disk, to the database, and across the server's local network.

Known proven techniques to mitigate the time costs of serving quickly changing web content:

Reverse proxy (Varnish-Cache)

Sits on port 80 and acts as a very thin web server. If a request is for something that isn't in the proxy's in-RAM cache, it sends the request on down to a "real" web server. This is especially useful for serving content that very rarely changes, such as your images and scripts, and has edge-side includes for content that mostly remains the same but has some small element that can't be cached... For instance, on an e-commerce site, a product's description, image, etc., may all be cached, but the HTML that shows the contents of a user's cart can't, so is an ideal candidate for an edge-side include.

This will help by greatly reducing the load on your system, since there will be far fewer requests that use disk IO, which is a resource far more limited than memory IO. (A hard drive can't seek for a database resource at the same time it's seeking for a cat jpeg.)
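As a rough sketch of the edge-side include idea, the PHP below renders a product page whose cacheable parts are plain HTML and whose cart is left as an ESI tag for the proxy to fill in. The function name and the /cart.php URL are hypothetical, and ESI processing has to be enabled in the proxy's own configuration for the tag to be expanded.

```php
<?php
// Hypothetical page renderer: everything except the cart is identical for all
// visitors, so the reverse proxy can cache this output.
function renderProductPage(string $title, string $description): string {
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $safeDescription = htmlspecialchars($description, ENT_QUOTES, 'UTF-8');

    // The proxy replaces the esi:include tag with the response from /cart.php
    // (an assumed URL) on every request, so only the cart is generated per user.
    return "<h1>{$safeTitle}</h1>\n"
         . "<p>{$safeDescription}</p>\n"
         . '<esi:include src="/cart.php"/>' . "\n";
}

echo renderProductPage('Cat poster', 'A3 print of a cat.');
```

The same split would apply to a social network page: the layout and static markup are cached, and only a small per-user fragment such as the notification badge is fetched from PHP.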

In Memory Key-Value Storage (Memcached)

This will probably give the most bang for your buck, in terms of creating a scalable notification system.

There are other in-memory key-value storage systems out there, but this one has support built right into PHP, not just once, but twice! (In the grand tradition of PHP core development, rather than fixing a broken implementation, they decided to consider the broken version deprecated without actually marking that system as deprecated and throwing the appropriate warnings, etc., that would get people to stop using the broken system. mysql_ v. mysqli_, I'm looking at you...) (Use the memcached extension, with the trailing d, not memcache.)

Anyways, it's simple: When you make a frequent database, filesystem, or network call, store the results in Memcached. When you update a record, file, or push data across the network, and that data is used in results stored in Memcached, update Memcached.

Then, when you need data, check Memcached first. If it's not there, then make the long, costly trip to disk, to the database, or across the network.
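The read and write paths just described could be sketched as follows. To keep the example runnable without a Memcached server, ArrayCache and NotificationStore are hypothetical stand-ins for the cache client and the database, and the stand-in returns null on a miss, which is an assumption of this sketch rather than the behaviour of PHP's Memcached class.

```php
<?php
// Hypothetical stand-in for a Memcached client so the sketch runs anywhere.
class ArrayCache {
    private array $items = [];
    public function get(string $key) { return $this->items[$key] ?? null; }
    public function set(string $key, $value): void { $this->items[$key] = $value; }
}

// Hypothetical stand-in for the database; $reads counts the costly trips.
class NotificationStore {
    public int $reads = 0;
    private array $counts = [];
    public function read(int $userId): int { $this->reads++; return $this->counts[$userId] ?? 0; }
    public function write(int $userId, int $count): void { $this->counts[$userId] = $count; }
}

class NotificationService {
    private ArrayCache $cache;
    private NotificationStore $store;

    public function __construct(ArrayCache $cache, NotificationStore $store) {
        $this->cache = $cache;
        $this->store = $store;
    }

    public function getCount(int $userId): int {
        $key = "notification_count:$userId";
        $cached = $this->cache->get($key);
        if ($cached !== null) {          // 0 is a valid cached value
            return $cached;
        }
        $count = $this->store->read($userId); // cache miss: make the costly trip
        $this->cache->set($key, $count);      // remember the result for next time
        return $count;
    }

    public function setCount(int $userId, int $count): void {
        $this->store->write($userId, $count);                    // persistent copy
        $this->cache->set("notification_count:$userId", $count); // keep cache in sync
    }
}

$store = new NotificationStore();
$service = new NotificationService(new ArrayCache(), $store);
$service->getCount(7);    // miss: reads the store once and caches 0
$service->getCount(7);    // hit: served from the cache
$service->setCount(7, 3); // updates the store and the cache together
echo $service->getCount(7), " notifications after ", $store->reads, " database read(s)\n";
```

Three lookups cost only one database read here, because the write path updates the cache instead of leaving a stale entry behind.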

Keep in mind that Memcached is not a persistent datastore... That is, if you reboot the server, Memcached comes back up completely empty. You still need a persistent datastore, so still use your database, files, and network. Also, Memcached is specifically designed to be a volatile storage, serving only the most accessed and most updated data quickly. If the data gets old, it could be erased to make room for newer data. After all, RAM is fast, but it's not nearly as cheap as disk space, so this is a good tradeoff.
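To illustrate what "erased to make room for newer data" means for calling code, here is a simplified sketch of a bounded cache that evicts its least recently used entry. TinyLruCache is a hypothetical class written for this example; Memcached's real eviction is more elaborate, so treat this only as a picture of the general behaviour.

```php
<?php
// Hypothetical bounded cache: when full, the least recently used entry is dropped.
class TinyLruCache {
    private array $items = [];   // PHP arrays keep insertion order
    private int $capacity;

    public function __construct(int $capacity) {
        $this->capacity = $capacity;
    }

    public function get(string $key) {
        if (!array_key_exists($key, $this->items)) {
            return null;                      // evicted or never stored
        }
        $value = $this->items[$key];
        unset($this->items[$key]);            // re-insert to mark as most recently used
        $this->items[$key] = $value;
        return $value;
    }

    public function set(string $key, $value): void {
        unset($this->items[$key]);
        if (count($this->items) >= $this->capacity) {
            unset($this->items[array_key_first($this->items)]); // evict the oldest entry
        }
        $this->items[$key] = $value;
    }
}

$cache = new TinyLruCache(2);
$cache->set('count:1', 5);
$cache->set('count:2', 0);
$cache->get('count:1');            // user 1 is now the most recently used
$cache->set('count:3', 9);         // cache is full, so count:2 is evicted
var_dump($cache->get('count:2'));  // NULL: the caller must fall back to the database
```

The practical consequence is that every cache read needs a fallback to the persistent store, since any entry may have disappeared since it was written.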

Also, key-value storage systems are not relational databases. There are reasons for relational databases. You do not want to write your own ACID guarantee wrapper around a key-value store. You do not want to enforce referential integrity on a key-value store. A fancy name for a key-value store is a NoSQL database. Keep that in mind: You might get horizontal scalability from the likes of Cassandra, and you might get blazing speed from the likes of Memcached, but you don't get SQL and all the many, many, many decades of evolution that RDBMSs have had.

And, finally:

Don't mix languages

If, after implementing a reverse proxy and an in-memory cache you still want to implement a WebSocket server, then more power to you. Just keep in mind the implications of which server you choose.

If you want to use Socket.io with Node.js, write your entire application in JavaScript. Otherwise, choose a WebSocket server that is written in the same language as the rest of your system.

Example of a one-language solution:

<?php // ~/projects/MySocialNetwork/core/users/myuser.php
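class MyUser {
    // Assumed to be set elsewhere (e.g. in a constructor not shown here).
    private $userId;
    private $memcachedWrapper;

    public function getNotificationCount() {
        // Note: Don't go to the DB first, if we can help it.
        // 0 is false-ish, so explicitly check for no result. The inner parentheses
        // matter: without them, !== binds tighter than = and $notifications
        // would hold a boolean instead of the count.
        if (($notifications = $this->memcachedWrapper->getNotificationCount($this->userId)) !== null)
            return $notifications;
        $userModel = new MyUserModel($this->userId);
        return $userModel->getNotificationCount();
    }
}
... 
<?php // ~/projects/WebSocketServerForMySocialNetwork/eventhandlers.php
// $connectedUsers is assumed to be passed in by the server's tick loop.
function websocketTickCallback(array $connectedUsers) {
    foreach ($connectedUsers as $user) {
        // Only push to users whose notification state has changed.
        if ($user->getHasChangedNotifications()) {
            $notificationCount = $user->getNotificationCount();
            $contents = json_encode(array('Notification Count' => $notificationCount));
            $message = new WebsocketResponse($user, $contents);
            $message->send();
            $user->resetHasChangedNotifications();
        }
    }
}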

If we were using socket.io, we would have to write our MyUser class twice, once in PHP and once in JavaScript. Who wants to bet that the classes will implement the same logic in the same ways in both languages? What if two developers are working on the different implementations of the classes? What if a bugfix gets applied to the PHP code, but nobody remembers the JavaScript?

solution1
0 ACCEPTED 2015-07-09 16:48:39
