
Load balancing and scheduling algorithms

So here is my problem:

I have several servers with different configurations. I have different calculations (jobs), and I can predict approximately how long each job will take to calculate. I also have priorities. My question is how to keep all machines loaded at 99-100% and schedule the jobs in the best way.

Each machine can run several calculations at a time. Jobs are pushed to the machines. The central machine knows the current load of each machine. I would also like to apply some kind of machine learning here, because I will have statistics for each job (start time, finish time, CPU load, etc.).

How can I distribute jobs (calculations) in the best possible way, keeping in mind the priorities?

Any suggestions, ideas, or algorithms?

FYI: my platform is .NET.

  1. Look at DryadLINQ. It is already available as an academic release and may be useful.
  2. Windows HPC Server: Microsoft's enterprise solution for distributed computing.
  3. Some code samples that can help you build load balancing by analyzing performance counters.
  4. Microsoft has a StockTrader sample application (with sources), which is an example of a distributable SOA with hand-written round-robin load balancing.

As an alternative approach, you could use estimates of each machine's peak performance ratio to schedule jobs. This can be very effective, but only if CPU runtime performance is all you are considering in the load-balanced system: issues concerning I/O, cluster size, network performance, type of memory model, etc. are neglected by this approach. Take a look at http://dx.doi.org/10.1145/1513895.1513901
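A minimal sketch of the idea, not taken from the linked paper: assume each machine has a single relative speed figure (its peak performance ratio) and each job has an estimated amount of work, so the estimated runtime is work divided by speed. The function name, the tuple layout, and the greedy "earliest estimated finish" rule are all assumptions made for illustration. Python is used only to keep the sketch short; the same logic carries over to .NET.

```python
def schedule_by_speed_ratio(jobs, speeds):
    """Greedy plan based on relative machine speeds (hypothetical helper).

    jobs   : list of (name, priority, work) tuples; higher priority runs first
    speeds : dict of machine name -> relative peak speed (assumed known)
    Returns (plan, ready): jobs per machine and the estimated time at which
    each machine finishes its queue.
    """
    ready = {m: 0.0 for m in speeds}
    plan = {m: [] for m in speeds}
    # Highest priority first; within one priority level, largest job first.
    for name, priority, work in sorted(jobs, key=lambda j: (-j[1], -j[2])):
        # Choose the machine on which this job would finish earliest.
        best = min(speeds, key=lambda m: ready[m] + work / speeds[m])
        ready[best] += work / speeds[best]
        plan[best].append(name)
    return plan, ready
```

As the answer notes, this only models CPU time: I/O, network, and memory effects are ignored, so the estimated finish times are optimistic whenever those matter.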

A more accurate approach (closer to a truly load-balanced job distribution) would depend on both the algorithm and the computer architecture. In this case, a higher-priority job can be scheduled to the best server that meets its demands, but you first need to determine an optimal mapping of jobs to servers. You may also apply OS scheduling algorithms designed for multiprocessors (not uniprocessors). Hope you find this helpful.

Looks like this has very little to do with .NET.

But think of your machines as 'worker threads': make a 'pool' of available machines ordered by available CPU (or another important resource), then use your knowledge of each task to push each job to the best-fitting machine.
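The pool idea could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive design: the `Dispatcher` class, its method names, and the use of whole-number CPU percentages are all hypothetical, and "best fitted" is interpreted here as the machine with the least free CPU that can still hold the job.

```python
import heapq


class Dispatcher:
    """Central queue that pushes jobs to machines (hypothetical sketch)."""

    def __init__(self, free_cpu):
        self.free_cpu = dict(free_cpu)  # machine -> free CPU in percent
        self.queue = []                 # heap of (-priority, seq, name, need)
        self.seq = 0                    # tie-breaker: keeps submission order

    def submit(self, name, priority, cpu_need):
        heapq.heappush(self.queue, (-priority, self.seq, name, cpu_need))
        self.seq += 1

    def release(self, machine, cpu_need):
        # Call when a job finishes and its CPU share becomes free again.
        self.free_cpu[machine] += cpu_need

    def dispatch(self):
        """Place queued jobs in priority order; return (job, machine) pairs.

        Jobs that fit on no machine right now stay in the queue.
        """
        placed, waiting = [], []
        while self.queue:
            item = heapq.heappop(self.queue)
            name, need = item[2], item[3]
            fits = [m for m, free in self.free_cpu.items() if free >= need]
            if not fits:
                waiting.append(item)
                continue
            # Best fit: the tightest machine that still has room.
            target = min(fits, key=lambda m: self.free_cpu[m])
            self.free_cpu[target] -= need
            placed.append((name, target))
        for item in waiting:
            heapq.heappush(self.queue, item)
        return placed
```

Note that a lower-priority job may be placed while a larger, higher-priority job is still waiting for room. That keeps machines busy but can delay big jobs; whether that trade-off is acceptable depends on how strict the priorities are.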

If you know all the jobs up front, you could probably use a 'best fit' algorithm to schedule them in the correct order on the correct machines. You could also look at 'cutting stock' algorithms: http://en.wikipedia.org/wiki/Cutting_stock_problem ...
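When every job and its estimated duration are known in advance, one simple heuristic in this family is longest-processing-time-first (LPT): sort the jobs by decreasing duration and always hand the next one to the machine that is currently least loaded. This is a swapped-in, well-known greedy heuristic rather than the exact 'best fit' or cutting-stock formulation mentioned above, and the sketch assumes identical machines and ignores priorities. The function name and the input layout are hypothetical.

```python
import heapq


def lpt_schedule(durations, machine_count):
    """Longest-processing-time-first assignment on identical machines.

    durations     : dict of job name -> estimated duration
    machine_count : number of identical machines (assumption)
    Returns (assignment, makespan), where assignment[i] lists the jobs
    given to machine i and makespan is the largest total load.
    """
    heap = [(0, m) for m in range(machine_count)]  # (current load, machine)
    heapq.heapify(heap)
    assignment = [[] for _ in range(machine_count)]
    # Longest jobs first; names break ties so the result is deterministic.
    for job, dur in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, m = heapq.heappop(heap)  # least-loaded machine so far
        assignment[m].append(job)
        heapq.heappush(heap, (load + dur, m))
    makespan = max(load for load, _ in heap)
    return assignment, makespan
```

LPT does not guarantee an optimal schedule, but it is cheap and gives a reasonable baseline to compare a more elaborate solver against.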

Microsoft recently published a paper on their Quincy scheduler. If you are simply optimizing for CPU utilization, then a very simple solver can find the global optimum. If you need to optimize across more axes, the problem space obviously becomes more complicated.

How big is your cluster? How do you deal with optimizing around failure cases? Do they matter? Is there IO? Does data have disk affinity? Is there more than one place to run a piece of a job? All things to consider.

