
Leveraging Heroku in Node.js

I've been looking for a PaaS provider for some time. Nodejitsu seemed promising but doesn't offer some of the features I'm looking for. I need to be able to process a lot of data quickly for many of my requests. I'm off to a good start with Node.js, but what I'd like to do is fire off tasks to scrape web data and compute some statistics (basically a roster) from information in the database.

Basically, I'm scraping people's social media (Facebook, Twitter, Tumblr, etc.) to determine how much exposure they get on my web service, then serving their latest content (an image and a short text) to the viewers. This ends up being a very large number of operations per request, because I need to compare statistics across many different artists.

What I imagine doing is something like this:

  1. Handle request. Serve template.
  2. Launch a web scraping task or tasks (one task for each social media site, or just one for all?)
  3. Launch task to query the database.
  4. Process the task output and respond to an AJAX long poll, or push over web sockets, to serve the processed data. Repeat until all tasks are finished.
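
The steps above could be sketched roughly as follows. This is only an illustration of launching the tasks together and reporting each result as soon as it is ready; `scrapeSocialMedia`, `queryRoster`, and `runTasks` are hypothetical names, and the stub bodies stand in for the real scraping and database work.

```javascript
// Hypothetical stand-ins for the real scraping and database work.
const scrapeSocialMedia = async (artist) => ({ source: 'scrape', artist, posts: [] });
const queryRoster = async (artist) => ({ source: 'database', artist, rank: null });

// Start every task at once and report each result as soon as it resolves,
// instead of waiting for the slowest task before sending anything.
async function runTasks(artist, onResult) {
  const tasks = [scrapeSocialMedia(artist), queryRoster(artist)];
  return Promise.all(
    tasks.map(async (task) => {
      const result = await task;
      onResult(result); // e.g. push over a web socket or answer a long poll
      return result;
    })
  );
}
```

Here `onResult` is where the long-poll response or web socket message from step 4 would be sent. Note that in this sketch the work still runs inside the web process, so it does not yet free up the web dyno.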

This is the structure I want to deploy on Heroku, so I can use worker dynos for the processing and free up web dynos, so users are never waiting in the dark for a page to load. Under high traffic some users may have to wait for the page to populate with content, but in most cases the content will start populating soon after the page is rendered. Either way, users who just intend to navigate to another page right away aren't stuck waiting for the site to finish responding.

So basically my question is: how do I leverage worker dynos to free up web dynos in Node? Or is there a better way to do this?

Sorry for any sloppiness, this was typed on my tablet.

Yes, Heroku is great for this sort of thing. See https://devcenter.heroku.com/articles/background-jobs-queueing

The missing component in your thinking is the use of a queue. Resque with coffee-resque is probably the most widely used, but Kue is a great option for an all-Node solution. Both run on top of Redis.
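
To make the queue idea concrete, here is a minimal in-memory sketch of the pattern, not of Kue or Resque themselves. `JobQueue`, `enqueue`, and the `'complete'`/`'failed'` event names are made up for illustration; a real Redis-backed queue would persist the jobs so that a separate worker dyno, rather than the same process, picks them up.

```javascript
const { EventEmitter } = require('node:events');

// In-memory stand-in for a Redis-backed queue (illustrative only).
class JobQueue extends EventEmitter {
  constructor() {
    super();
    this.handlers = new Map();
  }

  // Worker side: register the function that processes one job type.
  process(type, handler) {
    this.handlers.set(type, handler);
  }

  // Web side: enqueue a job and return immediately; the outcome arrives
  // later through the 'complete' or 'failed' event.
  enqueue(type, data) {
    setImmediate(async () => {
      try {
        const handler = this.handlers.get(type);
        if (!handler) throw new Error(`no worker registered for "${type}"`);
        this.emit('complete', type, await handler(data));
      } catch (err) {
        this.emit('failed', type, err);
      }
    });
  }
}
```

The web process would only call `enqueue` and listen for completion events to forward to the browser, while the handler passed to `process` would live in the worker.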

