Create Batches from Object collection
I have a collection of objects (more than 500 of them) that I would like to split into batches of a fixed size. For example, with batches of 50 I should get 10 sets (10 * 50 = 500).
I am using the logic below:
public class CustomEngineReader :IEnumerable<List<EngineToken>>
{
    StreamReader sr;
    int _batchSize = 1;
    public CustomFileReader(List<EngineToken> tokens, int batchSize)
    {
        if (batchSize > 0)
        {
            _batchSize = batchSize;
        }
    }
    public IEnumerator<List<string>> GetEnumerator()
    {
        string input = string.Empty;
        foreach(var item in EngineTokens)
        {
            int i = 0;
            List<string> batch = new List<string>();
            while (i < _batchSize && item !=null)
            {
                batch.Add(item );
                i++;
            }
            if (batch.Count != 0)
            {
                yield return batch;
            }
        }
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
I am using the above code like this:
CustomEngineReader reader = new CustomEngineReader (this.tokencollection, 50);
foreach(List<EngineToken> items in reader)
{
    //in each iteration we get batch specified objects
    foreach(EngineToken item in items)
    {
        //Process
    }
}
It's not working.
Having a while loop inside of your foreach means that you're adding the same item over and over again until you hit your batch size.
You also need to create your batch object outside of the foreach, so that you can add multiple items from the loop into it.
Personally I prefer to write this as an extension method rather than a separate class, in keeping with the LINQ style of programming. It can also be made generic with very little effort, which makes it much more widely useful. My implementation of Batch is:
public static IEnumerable<IEnumerable<T>> Batch<T>(
    this IEnumerable<T> source, int batchSize)
{
    List<T> buffer = new List<T>(batchSize);
    foreach (T item in source)
    {
        buffer.Add(item);
        if (buffer.Count >= batchSize)
        {
            yield return buffer;
            buffer = new List<T>(batchSize);
        }
    }
    if (buffer.Count > 0)
    {
        yield return buffer;
    }
}
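To see how the Batch extension method above behaves, here is a minimal, self-contained sketch. The class names BatchExtensions and Program are assumptions made for the example, and plain integers stand in for the EngineToken objects from the question. It uses 505 items so that the final, partial batch is visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class; extension methods must live in a static class.
public static class BatchExtensions
{
    public static IEnumerable<IEnumerable<T>> Batch<T>(
        this IEnumerable<T> source, int batchSize)
    {
        List<T> buffer = new List<T>(batchSize);
        foreach (T item in source)
        {
            buffer.Add(item);
            if (buffer.Count >= batchSize)
            {
                // Hand out the full buffer and start a fresh one.
                yield return buffer;
                buffer = new List<T>(batchSize);
            }
        }
        // Emit whatever is left over as a final, smaller batch.
        if (buffer.Count > 0)
        {
            yield return buffer;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Integers stand in for the EngineToken objects (an assumption).
        List<int> tokens = Enumerable.Range(1, 505).ToList();

        List<IEnumerable<int>> batches = tokens.Batch(50).ToList();

        Console.WriteLine(batches.Count);           // 11 (10 full batches + 1 partial)
        Console.WriteLine(batches[0].Count());      // 50
        Console.WriteLine(batches[10].Count());     // 5
    }
}
```

Because the method uses yield return, batches are built lazily: nothing is read from the source until the result is enumerated.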
You are adding the same item multiple times in a loop. Try:
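public IEnumerator<List<EngineToken>> GetEnumerator()
{
    int i = 0;
    // The batch lives outside the loop so it can collect several items.
    List<EngineToken> batch = new List<EngineToken>();
    foreach (var item in EngineTokens)
    {
        batch.Add(item);
        i++;
        if (i == _batchSize)
        {
            // Batch is full: hand it out and start a new one.
            yield return batch;
            batch = new List<EngineToken>();
            i = 0;
        }
    }
    // Return the remaining items as a final, smaller batch.
    if (batch.Count != 0)
    {
        yield return batch;
    }
}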
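For context, this is roughly how that GetEnumerator could sit inside the complete class from the question. It is only a sketch under a few assumptions: the EngineToken type is not shown in the question, so a placeholder with a hypothetical Id property is used; the constructor is named after the class and stores the list in a hypothetical _tokens field, which the original code never did; and the Demo class exists only to make the example runnable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Placeholder for the real EngineToken type (its members are an assumption).
public class EngineToken
{
    public int Id { get; set; }
}

public class CustomEngineReader : IEnumerable<List<EngineToken>>
{
    private readonly List<EngineToken> _tokens;   // hypothetical field name
    private readonly int _batchSize = 1;

    public CustomEngineReader(List<EngineToken> tokens, int batchSize)
    {
        _tokens = tokens ?? throw new ArgumentNullException(nameof(tokens));
        if (batchSize > 0)
        {
            _batchSize = batchSize;
        }
    }

    public IEnumerator<List<EngineToken>> GetEnumerator()
    {
        // One batch is filled across several loop iterations.
        var batch = new List<EngineToken>(_batchSize);
        foreach (var item in _tokens)
        {
            batch.Add(item);
            if (batch.Count == _batchSize)
            {
                yield return batch;
                batch = new List<EngineToken>(_batchSize);
            }
        }
        // Leftover items form the last, smaller batch.
        if (batch.Count != 0)
        {
            yield return batch;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class Demo
{
    public static void Main()
    {
        var tokens = new List<EngineToken>();
        for (int id = 1; id <= 500; id++)
        {
            tokens.Add(new EngineToken { Id = id });
        }

        var reader = new CustomEngineReader(tokens, 50);
        int batchCount = 0;
        foreach (List<EngineToken> items in reader)
        {
            batchCount++;   // each batch holds up to 50 tokens
        }
        Console.WriteLine(batchCount);   // 10
    }
}
```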
Here is a simpler implementation. It is not thread-safe; for thread safety you will need to add locks where necessary.
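// Requires: using System.Linq;
public IEnumerator<List<EngineToken>> GetEnumerator()
{
    // Work on a local reference so the EngineTokens field itself is left intact.
    var remaining = EngineTokens;
    while (remaining.Count > 0)
    {
        // Take the next batch, then drop it from the remaining items.
        yield return remaining.Take(_batchSize).ToList();
        remaining = remaining.Skip(_batchSize).ToList();
    }
}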