How to schedule jobs without overlap using LINQ to Objects?
This is another resource-allocation problem. My goal is to run a query to assign the top-priority job for any time-slot to one of two CPU cores (just an example, so let's assume no interrupts or multi-tasking). Note: this is similar to my earlier post about partitioning, but focuses on overlapping times and assigning multiple items, not just the top-priority item.
Here is our object:
public class Job
{
    public int Id;
    public int Priority;
    public DateTime Begin;
    public DateTime End;
}
The real dataset is very large, but for this example, let's say there are 1000 jobs to be assigned to two CPU cores. They are all loaded into memory, and I need to run a single LINQ to Objects query against them. This is currently taking almost 8 seconds and 1.4 million comparisons.
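As a sanity check on that figure: assuming each pass calls Overlaps once for every ordered pair of jobs that are still unassigned (which is what the nested query in the full sample code does), the count reported in the sample output can be reproduced by hand. The 1000 sample jobs fall into 334 groups that do not overlap each other, so the first pass assigns 334 jobs and leaves 666 for the second pass:

```latex
\underbrace{1000^2}_{\text{core 0}} + \underbrace{666^2}_{\text{core 1}}
  = 1{,}000{,}000 + 443{,}556 = 1{,}443{,}556
```

In other words, the work grows roughly with the square of the number of remaining jobs on every pass.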
I have leveraged the logic cited in this post to determine whether two items are overlapping, but unlike that post, I don't simply need to find overlapping items; I need to schedule the top item of any overlapping set, and then schedule the next one.
Before I get to the code, let me point out the steps of the current inefficient algorithm:
Questions:
Full Sample Code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Job
{
    public static long Iterations;
    public int Id;
    public int Priority;
    public DateTime Begin;
    public DateTime End;

    public bool Overlaps(Job other)
    {
        Iterations++;
        return this.End > other.Begin && this.Begin < other.End;
    }
}

public class Assignment
{
    public Job Job;
    public int Core;
}
class Program
{
    static void Main(string[] args)
    {
        const int Jobs = 1000;
        const int Cores = 2;
        const int ConcurrentJobs = Cores + 1;
        const int Priorities = Cores + 3;
        DateTime startTime = new DateTime(2011, 3, 1, 0, 0, 0, 0);
        Console.WriteLine(string.Format("{0} Jobs x {1} Cores", Jobs, Cores));
        var timer = Stopwatch.StartNew();

        Console.WriteLine("Populating data");
        var jobs = new List<Job>();
        for (int jobId = 0; jobId < Jobs; jobId++)
        {
            var jobStart = startTime.AddHours(jobId / ConcurrentJobs).AddMinutes(jobId % ConcurrentJobs);
            jobs.Add(new Job() { Id = jobId, Priority = jobId % Priorities, Begin = jobStart, End = jobStart.AddHours(0.5) });
        }
        Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));
        timer.Restart();

        Console.WriteLine("Assigning Jobs to Cores");
        IEnumerable<Assignment> assignments = null;
        for (int core = 0; core < Cores; core++)
        {
            // avoid modified closures by creating local variables
            int localCore = core;
            var localAssignments = assignments;

            // Step 1: Determine the remaining jobs
            var remainingJobs = localAssignments == null ?
                jobs :
                from j in jobs where !(from a in localAssignments select a.Job).Contains(j) select j;

            // Step 2: Assign the top priority job in any time-slot to the core
            var assignmentsForCore = from s1 in remainingJobs
                                     where
                                         (from s2 in remainingJobs
                                          where s1.Overlaps(s2)
                                          orderby s2.Priority
                                          select s2).First().Equals(s1)
                                     select new Assignment { Job = s1, Core = localCore };

            // Step 3: Accumulate the results (unfortunately requires a .ToList() to avoid massive over-joins)
            assignments = assignments == null ? assignmentsForCore.ToList() : assignments.Concat(assignmentsForCore.ToList());
        }

        // This is where I'd like to Execute the query one single time across all cores, but have to do intermediate steps to avoid massive-over-joins
        assignments = assignments.ToList();
        Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));

        Console.WriteLine("\nJobs:");
        foreach (var job in jobs.Take(20))
        {
            Console.WriteLine(string.Format("{0}-{1} Id {2} P{3}", job.Begin, job.End, job.Id, job.Priority));
        }

        Console.WriteLine("\nAssignments:");
        foreach (var assignment in assignments.OrderBy(a => a.Job.Begin).Take(10))
        {
            Console.WriteLine(string.Format("{0}-{1} Id {2} P{3} C{4}", assignment.Job.Begin, assignment.Job.End, assignment.Job.Id, assignment.Job.Priority, assignment.Core));
        }

        Console.WriteLine(string.Format("\nTotal Comparisons: {0:n}", Job.Iterations));
        Console.WriteLine("Any key to continue");
        Console.ReadKey();
    }
}
Sample Output:
1000 Jobs x 2 Cores
Populating data
Completed in 0.00ms
Assigning Jobs to Cores
Completed in 7,998.00ms

Jobs:
3/1/2011 12:00:00 AM-3/1/2011 12:30:00 AM Id 0 P0
3/1/2011 12:01:00 AM-3/1/2011 12:31:00 AM Id 1 P1
3/1/2011 12:02:00 AM-3/1/2011 12:32:00 AM Id 2 P2
3/1/2011 1:00:00 AM-3/1/2011 1:30:00 AM Id 3 P3
3/1/2011 1:01:00 AM-3/1/2011 1:31:00 AM Id 4 P4
3/1/2011 1:02:00 AM-3/1/2011 1:32:00 AM Id 5 P0
3/1/2011 2:00:00 AM-3/1/2011 2:30:00 AM Id 6 P1
3/1/2011 2:01:00 AM-3/1/2011 2:31:00 AM Id 7 P2
3/1/2011 2:02:00 AM-3/1/2011 2:32:00 AM Id 8 P3
3/1/2011 3:00:00 AM-3/1/2011 3:30:00 AM Id 9 P4
3/1/2011 3:01:00 AM-3/1/2011 3:31:00 AM Id 10 P0
3/1/2011 3:02:00 AM-3/1/2011 3:32:00 AM Id 11 P1
3/1/2011 4:00:00 AM-3/1/2011 4:30:00 AM Id 12 P2
3/1/2011 4:01:00 AM-3/1/2011 4:31:00 AM Id 13 P3
3/1/2011 4:02:00 AM-3/1/2011 4:32:00 AM Id 14 P4
3/1/2011 5:00:00 AM-3/1/2011 5:30:00 AM Id 15 P0
3/1/2011 5:01:00 AM-3/1/2011 5:31:00 AM Id 16 P1
3/1/2011 5:02:00 AM-3/1/2011 5:32:00 AM Id 17 P2
3/1/2011 6:00:00 AM-3/1/2011 6:30:00 AM Id 18 P3
3/1/2011 6:01:00 AM-3/1/2011 6:31:00 AM Id 19 P4

Assignments:
3/1/2011 12:00:00 AM-3/1/2011 12:30:00 AM Id 0 P0 C0
3/1/2011 12:01:00 AM-3/1/2011 12:31:00 AM Id 1 P1 C1
3/1/2011 1:00:00 AM-3/1/2011 1:30:00 AM Id 3 P3 C1
3/1/2011 1:02:00 AM-3/1/2011 1:32:00 AM Id 5 P0 C0
3/1/2011 2:00:00 AM-3/1/2011 2:30:00 AM Id 6 P1 C0
3/1/2011 2:01:00 AM-3/1/2011 2:31:00 AM Id 7 P2 C1
3/1/2011 3:01:00 AM-3/1/2011 3:31:00 AM Id 10 P0 C0
3/1/2011 3:02:00 AM-3/1/2011 3:32:00 AM Id 11 P1 C1
3/1/2011 4:00:00 AM-3/1/2011 4:30:00 AM Id 12 P2 C0
3/1/2011 4:01:00 AM-3/1/2011 4:31:00 AM Id 13 P3 C1
3/1/2011 5:00:00 AM-3/1/2011 5:30:00 AM Id 15 P0 C0

Total Comparisons: 1,443,556.00
Any key to continue
Is there a reason for using LINQ to Objects for this task? I think that I would create an active list, put all of the jobs in a queue, and pop the next one out of the queue whenever the active list dipped below 10 and stick it into the active list. It's easy enough to track which core is executing which task and assign the next task in the queue to the least busy core. Wire up a finished event to the job or just monitor the active list and you'll know when it's appropriate to pop another job off the queue and into the active list.
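A rough sketch of that queue idea, under some assumptions that are not in the answer: jobs keep the fixed Begin/End slots from the question, "least busy" is taken to mean the core that becomes free first, and a job that finds no free core at its Begin time is simply skipped. The names QueueScheduler and Schedule are made up for illustration. Note that it dispatches in start-time order, so it is first-come scheduling rather than the question's top-priority rule.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Job
{
    public int Id;
    public int Priority;
    public DateTime Begin;
    public DateTime End;
}

// Hypothetical helper, not part of the original posts.
public static class QueueScheduler
{
    // Returns a map of job Id -> core for every job that found a free core.
    public static Dictionary<int, int> Schedule(IEnumerable<Job> jobs, int cores)
    {
        // Queue ordered by start time, then by Priority (lower value first,
        // matching the question's "orderby s2.Priority ... First()").
        var queue = new Queue<Job>(jobs.OrderBy(j => j.Begin).ThenBy(j => j.Priority));

        // freeAt[c] is the time core c finishes its current job;
        // the default DateTime.MinValue means the core has never been used.
        var freeAt = new DateTime[cores];
        var result = new Dictionary<int, int>();

        while (queue.Count > 0)
        {
            Job job = queue.Dequeue();

            // Assumption: the "least busy" core is the one that frees up first.
            int best = 0;
            for (int c = 1; c < cores; c++)
                if (freeAt[c] < freeAt[best]) best = c;

            // Assign only if that core is idle when the job is due to start.
            if (freeAt[best] <= job.Begin)
            {
                freeAt[best] = job.End;
                result[job.Id] = best;
            }
        }
        return result;
    }
}
```

Each job is looked at once and compared against each core once, so the cost after sorting is proportional to jobs times cores rather than jobs squared.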
I would rather do it in a single loop. Mine produces a different result from yours. Yours scheduled 2/3 of all the jobs. Mine scheduled all. I will add explanations later. Going off for an appointment now.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Job
{
    public static long Iterations;
    public int Id;
    public int Priority;
    public DateTime Begin;
    public DateTime End;

    public bool Overlaps(Job other)
    {
        Iterations++;
        return this.End > other.Begin && this.Begin < other.End;
    }
}

public class Assignment : IComparable<Assignment>
{
    public Job Job;
    public int Core;

    #region IComparable<Assignment> Members
    public int CompareTo(Assignment other)
    {
        return Job.Begin.CompareTo(other.Job.Begin);
    }
    #endregion
}
class Program
{
    static void Main(string[] args)
    {
        const int Jobs = 1000;
        const int Cores = 2;
        const int ConcurrentJobs = Cores + 1;
        const int Priorities = Cores + 3;
        DateTime startTime = new DateTime(2011, 3, 1, 0, 0, 0, 0);
        Console.WriteLine(string.Format("{0} Jobs x {1} Cores", Jobs, Cores));
        var timer = Stopwatch.StartNew();

        Console.WriteLine("Populating data");
        var jobs = new List<Job>();
        for (int jobId = 0; jobId < Jobs; jobId++)
        {
            var jobStart = startTime.AddHours(jobId / ConcurrentJobs).AddMinutes(jobId % ConcurrentJobs);
            jobs.Add(new Job() { Id = jobId, Priority = jobId % Priorities, Begin = jobStart, End = jobStart.AddHours(0.5) });
        }
        Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));
        timer.Restart(); // Restart, not Reset: Reset would leave the stopwatch stopped at zero

        Console.WriteLine("Assigning Jobs to Cores");
        List<Assignment>[] assignments = new List<Assignment>[Cores];
        for (int core = 0; core < Cores; core++)
            assignments[core] = new List<Assignment>();

        // lastJobs[core] is the job most recently placed on that core
        Job[] lastJobs = new Job[Cores];
        foreach (Job j in jobs)
        {
            Job job = j;
            bool assigned = false;
            for (int core = 0; core < Cores; core++)
            {
                if (lastJobs[core] == null || !lastJobs[core].Overlaps(job))
                {
                    // Assign directly if no last job or no overlap with last job
                    lastJobs[core] = job;
                    assignments[core].Add(new Assignment { Job = job, Core = core });
                    assigned = true;
                    break;
                }
                else if (job.Priority > lastJobs[core].Priority)
                {
                    // Overlap and the new job has the larger Priority value, so it replaces the core's last job
                    Job temp = lastJobs[core];
                    lastJobs[core] = job;
                    job = temp; // job now refers to the displaced job
                    // As written, this records the displaced job, and the break ends the search before another core is tried
                    assignments[core].Add(new Assignment { Job = job, Core = core });
                    assigned = true;
                    break;
                }
            }
            if (!assigned)
            {
                // TODO: What to do if not assigned? Your code seems to just ignore them
            }
        }

        List<Assignment> merged = new List<Assignment>();
        for (int core = 0; core < Cores; core++)
            merged.AddRange(assignments[core]);
        merged.Sort();
        Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));
        timer.Restart(); // Restart so the comparison run below is timed as well
        Console.WriteLine(string.Format("\nTotal Comparisons: {0:n}", Job.Iterations));
        Job.Iterations = 0; // Reset to count again
        {
            IEnumerable<Assignment> assignments2 = null;
            for (int core = 0; core < Cores; core++)
            {
                // avoid modified closures by creating local variables
                int localCore = core;
                var localAssignments = assignments2;

                // Step 1: Determine the remaining jobs
                var remainingJobs = localAssignments == null ?
                    jobs :
                    from j in jobs where !(from a in localAssignments select a.Job).Contains(j) select j;

                // Step 2: Assign the top priority job in any time-slot to the core
                var assignmentsForCore = from s1 in remainingJobs
                                         where
                                             (from s2 in remainingJobs
                                              where s1.Overlaps(s2)
                                              orderby s2.Priority
                                              select s2).First().Equals(s1)
                                         select new Assignment { Job = s1, Core = localCore };

                // Step 3: Accumulate the results (unfortunately requires a .ToList() to avoid massive over-joins)
                assignments2 = assignments2 == null ? assignmentsForCore.ToList() : assignments2.Concat(assignmentsForCore.ToList());
            }

            // This is where I'd like to Execute the query one single time across all cores, but have to do intermediate steps to avoid massive-over-joins
            assignments2 = assignments2.ToList();
            Console.WriteLine(string.Format("Completed in {0:n}ms", timer.ElapsedMilliseconds));

            Console.WriteLine("\nJobs:");
            foreach (var job in jobs.Take(20))
            {
                Console.WriteLine(string.Format("{0}-{1} Id {2} P{3}", job.Begin, job.End, job.Id, job.Priority));
            }

            Console.WriteLine("\nAssignments:");
            foreach (var assignment in assignments2.OrderBy(a => a.Job.Begin).Take(10))
            {
                Console.WriteLine(string.Format("{0}-{1} Id {2} P{3} C{4}", assignment.Job.Begin, assignment.Job.End, assignment.Job.Id, assignment.Job.Priority, assignment.Core));
            }

            // Compare the single-loop result with the LINQ result
            if (merged.Count != assignments2.Count())
                System.Console.WriteLine("Difference count {0}, {1}", merged.Count, assignments2.Count());
            for (int i = 0; i < merged.Count() && i < assignments2.Count(); i++)
            {
                var a2 = assignments2.ElementAt(i);
                var a = merged[i];
                if (a.Job.Id != a2.Job.Id)
                    System.Console.WriteLine("Difference at {0} {1} {2}", i, a.Job.Begin, a2.Job.Begin);
                if (i % 100 == 0) Console.ReadKey();
            }
        }

        Console.WriteLine(string.Format("\nTotal Comparisons: {0:n}", Job.Iterations));
        Console.WriteLine("Any key to continue");
        Console.ReadKey();
    }
}
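For comparison, here is a minimal sketch (not taken from either post) of how the question's selection rule could be kept while avoiding the full self-join. It assumes every job has End later than Begin, sorts the remaining jobs by Begin once per core, and then only scans neighbours that can still overlap. SweepScheduler, Assign and Beats are hypothetical names, and breaking priority ties by the lower Id is an assumption; the question's query breaks ties by list order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Job
{
    public int Id;
    public int Priority;
    public DateTime Begin;
    public DateTime End;
}

public class Assignment
{
    public Job Job;
    public int Core;
}

// Hypothetical helper, not part of the original posts.
public static class SweepScheduler
{
    // Assumed tie-break: lower Priority value wins, then lower Id.
    static bool Beats(Job a, Job b)
    {
        return a.Priority < b.Priority || (a.Priority == b.Priority && a.Id < b.Id);
    }

    static bool Overlaps(Job a, Job b)
    {
        return a.End > b.Begin && a.Begin < b.End;
    }

    public static List<Assignment> Assign(IEnumerable<Job> jobs, int cores)
    {
        var result = new List<Assignment>();
        // Sorted by Begin so each scan can stop as soon as no overlap is possible.
        List<Job> remaining = jobs.OrderBy(j => j.Begin).ToList();

        for (int core = 0; core < cores && remaining.Count > 0; core++)
        {
            int n = remaining.Count;
            // maxEnd[i] = latest End among remaining[0..i]; bounds the backward scan.
            var maxEnd = new DateTime[n];
            DateTime running = DateTime.MinValue;
            for (int i = 0; i < n; i++)
            {
                if (remaining[i].End > running) running = remaining[i].End;
                maxEnd[i] = running;
            }

            var picked = new HashSet<Job>();
            for (int i = 0; i < n; i++)
            {
                Job s1 = remaining[i];
                bool top = true;
                // Later starters: stop at the first job that begins at or after s1 ends.
                for (int j = i + 1; top && j < n && remaining[j].Begin < s1.End; j++)
                    if (Overlaps(s1, remaining[j]) && Beats(remaining[j], s1)) top = false;
                // Earlier starters: stop once nothing earlier can still be running.
                for (int j = i - 1; top && j >= 0 && maxEnd[j] > s1.Begin; j--)
                    if (Overlaps(s1, remaining[j]) && Beats(remaining[j], s1)) top = false;

                // s1 is kept only if no overlapping remaining job outranks it.
                if (top)
                {
                    picked.Add(s1);
                    result.Add(new Assignment { Job = s1, Core = core });
                }
            }
            // Jobs picked for this core are not considered for the next one.
            remaining = remaining.Where(j => !picked.Contains(j)).ToList();
        }
        return result;
    }
}
```

Run against the question's sample data (1000 jobs, 2 cores), this should give the same first assignments as the sample output: Id 0 and Id 5 on core 0, Id 1 and Id 3 on core 1. When only a few jobs overlap at any moment, each job is compared with a handful of neighbours instead of with every other job.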
Removed due to a major bug. Reworking on it.. :P