简体   繁体   中英

Realtime data processing with Python

I am working on a project which is going to consume data from Twitter Stream API and count certain hashtags. But I have difficulties in understanding what kind architecture I need in my case. Should I use Tornado or is there more suitable frameworks for this?

It really depends on what you want to do with the Tweets. Simply reading a stream of Tweets has not been an issue that I've seen. In fact that can be done on an AWS Micro Instance. I even run more advanced regression algorithms on the real-time feed. The scalability problem arises if you try to process a set of historical Tweets. Since Tweets are produced so fast, processing historical Tweets can be very slow. That's when you should try to parallelize.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM