
Azure Cloud Services

I was asked to rewrite a certain program in C# using Windows Azure.

The program currently spawns numerous threads to complete tasks it receives as data rows from a database (new ones appear constantly). The number of threads will have to be dynamic (to maximize efficiency), but the exact code executed by each depends on the job type; each type has a separate class to handle the data.

Looking at Azure, I think the best place to put this program would be Worker Roles in Cloud Services. In general, is that a good place?

If so, should one worker spawn numerous threads/tasks, or should I spawn numerous workers?

Thank you in advance.

This question doesn't really have an easy answer, as it depends heavily on the nature of the work being done, your need for fast deployments, and how much control you require over the environment on the machine. I agree that a Worker role in Cloud Services is a good option to look at; however, you might also be able to do the work using Windows Azure Web Sites with the new "always on" feature that was announced last week. You can write a web application that pulls the jobs with a background thread and spawns new threads as needed. The Web Sites approach would likely work best in Standard mode, where you are given a dedicated VM, but at that point it might just be better to go the Cloud Services route. Cloud Services is slightly cheaper per hour than Web Sites in Standard mode for the same sized machine; however, Web Sites offers a faster deployment mechanism. There are many questions to answer before I'd say either one is better for you. For now we'll assume you go the route of the Worker role in Cloud Services.

Ideally you want to utilize each machine as much as possible, without completely maxing it out, before you spin up another instance. If your jobs are heavily CPU bound and you have just a couple of cores on the box, you won't be able to handle many jobs at the same time on one machine. CPU bound jobs chug through as fast as they can, but once the CPUs are all pegged at 100%, putting more jobs on that machine won't help. Thrashing starts to occur as multiple jobs compete for the CPU. In this case having multiple instances will help you scale. However, if you have an IO bound job, meaning something that is mostly reading from or writing to the database, BLOB storage, etc., then the CPU is mostly idle and you can handle a LOT of these jobs on the same machine until you saturate the network capacity.
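To make the CPU bound versus IO bound distinction concrete, here is a minimal sketch in Python (chosen only for brevity; the same idea carries over to C# tasks). The names `pool_size`, `fake_io_job`, `run_io_jobs`, and the `io_multiplier` of 8 are assumptions made up for illustration, not measured values or anything from the Azure SDK. The real multiplier should come from measuring your own jobs.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def pool_size(kind, cores=None, io_multiplier=8):
    """Pick a concurrency limit for one machine (hypothetical heuristic)."""
    cores = cores or os.cpu_count() or 1
    if kind == "cpu":
        # More concurrent CPU bound jobs than cores only adds contention.
        return cores
    if kind == "io":
        # IO bound jobs mostly wait, so many of them can share each core.
        return cores * io_multiplier
    raise ValueError(f"unknown job kind: {kind!r}")


def fake_io_job(job_id, wait_seconds=0.05):
    """Stand-in for a database or blob call: waits without using the CPU."""
    time.sleep(wait_seconds)
    return job_id


def run_io_jobs(job_ids, cores=2):
    """Run simulated IO bound jobs concurrently on one machine."""
    with ThreadPoolExecutor(max_workers=pool_size("io", cores)) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(fake_io_job, job_ids))
```

With these assumed numbers a 4 core box would run 4 CPU bound jobs at a time but 32 IO bound ones, which is the asymmetry described above.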

You mentioned that each thread might get a different "job", which means the jobs may have different characteristics, and that makes the decision a little harder. The best advice I can give is to start measuring each job type (is it CPU bound, IO bound, etc.?) and look at what makes the most sense. This may mean that you classify jobs as one type or the other and segment them, so that IO bound jobs only run on one set of workers while the CPU bound ones run on another set. This would let you scale them independently.
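The segmenting idea could look roughly like the sketch below, again in Python for brevity. The job type names in `JOB_PROFILES`, the `route_jobs` function, and the choice of "io" as the default pool are all hypothetical; in practice the profile table would be filled in from your own measurements, and each pool would map to its own worker role (or its own queue).

```python
from collections import defaultdict

# Hypothetical mapping from job type to its measured profile.
JOB_PROFILES = {
    "resize_image": "cpu",
    "send_email": "io",
    "import_rows": "io",
}


def route_jobs(jobs, profiles=JOB_PROFILES, default_pool="io"):
    """Group jobs by the worker pool that should run them."""
    pools = defaultdict(list)
    for job in jobs:
        # Unmeasured job types fall back to the default pool.
        pool_name = profiles.get(job["type"], default_pool)
        pools[pool_name].append(job)
    return dict(pools)
```

Each resulting group can then be scaled on its own: add instances to the "cpu" set when those jobs back up, without paying for extra capacity on the "io" set.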

You can also see how multi-core machines help you; note that pricing scales pretty linearly with size, so a 2 core box costs half as much as a 4 core box. So while the inclination might be to get the biggest box you can, if you end up needing 10 cores to do the work it might be better to run ten 1 core instances rather than three 4 core boxes, because you'll get a finer grain of scalability and if one machine goes offline you don't lose as much capacity. I will point out that the larger the box you select, the more network capacity you are assigned as well, so if your bottleneck is network bandwidth, you need a bigger box.
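The arithmetic behind that trade-off can be sketched as follows. This assumes the linear pricing mentioned above; `fleet_options` is a made-up helper, and `price_per_core_hour` is a placeholder unit, not a real Azure price.

```python
import math


def fleet_options(cores_needed, cores_per_instance, price_per_core_hour=1.0):
    """Compare fleet shapes, assuming price is linear in core count."""
    # Round up: you cannot rent a fraction of an instance.
    instances = math.ceil(cores_needed / cores_per_instance)
    total_cores = instances * cores_per_instance
    return {
        "instances": instances,
        "total_cores": total_cores,
        "hourly_cost": total_cores * price_per_core_hour,
        # Fraction of the fleet's capacity that disappears with one machine.
        "capacity_lost_if_one_fails": cores_per_instance / total_cores,
    }
```

For the 10 core example, 1 core instances give 10 instances, 10 paid cores, and a 10% capacity loss if one fails, while 4 core instances give 3 instances, 12 paid cores, and a loss of one third.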

Whatever you do, don't simply run one thread on a Worker Role and scale those as needed. This will likely be a complete waste of resources. Measure your workloads and make the best decision for your scenario.


 