Why don't the Linq extension methods sit on IEnumerator rather than IEnumerable?

There are lots of Linq algorithms that only need to do one pass through the input, e.g. Select.

Yet all the Linq extension methods sit on IEnumerable rather than IEnumerator:

    var e = new[] { 1, 2, 3, 4, 5 }.GetEnumerator(); 
    e.Select(x => x * x); // Doesn't work 

This means you can't use Linq in any situation where you are reading from an "already opened" stream.

This scenario comes up a lot in a project I am currently working on: I want to return an IEnumerator whose Dispose method will close the stream, and have all the downstream Linq code operate on it.

In short, I have an "already opened" stream of results which I can convert into an appropriately disposable IEnumerator, but unfortunately all of the downstream code requires an IEnumerable rather than an IEnumerator, even though it's only going to do one "pass".

I.e. I want to "implement" this return type on a variety of different sources (CSV files, IDataReaders, etc.):

class TabularStream 
{ 
    Column[] Columns; 
    IEnumerator<object[]> RowStream; 
}

In order to get the "Columns" I have to have already opened the CSV file, initiated the SQL query, or whatever. I can then return an IEnumerator whose Dispose method closes the resource, but all of the Linq operations require an IEnumerable.

The best workaround I know of is to implement an IEnumerable whose GetEnumerator() method returns the one-and-only IEnumerator and throws an error if something tries to call GetEnumerator() a second time.

Does this all sound OK, or is there a much better way for me to implement "TabularStream" so that it's easy to use from Linq?

Using IEnumerator<T> directly is rarely a good idea, in my view.

For one thing, it encodes the fact that it's destructive, whereas LINQ queries can usually be run multiple times. They're meant to be side-effect-free, whereas the act of iterating over an IEnumerator<T> is naturally side-effecting.

It also makes it virtually impossible to perform some of the optimizations in LINQ to Objects, such as using the Count property if you're actually asking an ICollection<T> for its count.
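To illustrate the kind of optimization meant here, the following is a minimal sketch of a count helper that inspects the runtime type of its source. The names `SequenceHelpers` and `CountOf` are hypothetical and this is not the actual LINQ to Objects source; it only shows why the check needs the collection itself rather than an enumerator that has already been handed out.

```csharp
using System.Collections.Generic;

public static class SequenceHelpers
{
    // Hypothetical helper, for illustration only.
    public static int CountOf<T>(IEnumerable<T> source)
    {
        // Fast path: the collection already knows its size, so no iteration is needed.
        if (source is ICollection<T> collection)
            return collection.Count;

        // Slow path: walk the whole sequence once.
        int n = 0;
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            while (e.MoveNext()) n++;
        }
        return n;
    }
}
```

With only an IEnumerator<T> in hand, the fast path is unavailable: the enumerator does not expose the collection it came from.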

As for your workaround: yes, a OneShotEnumerable would be a reasonable approach.
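A minimal sketch of such a wrapper might look like this. The class name follows the one used above, but the implementation details (the exception type, the use of `Interlocked.Exchange`) are assumptions rather than anything prescribed by the answer.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Exposes a single IEnumerator<T> as an IEnumerable<T> that may be
// enumerated at most once. Sketch only; names and details are assumptions.
public sealed class OneShotEnumerable<T> : IEnumerable<T>
{
    private IEnumerator<T> _source;

    public OneShotEnumerable(IEnumerator<T> source)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Atomically take the enumerator so that any later caller sees null.
        IEnumerator<T> e = Interlocked.Exchange(ref _source, null);
        if (e == null)
            throw new InvalidOperationException("This sequence can only be enumerated once.");
        return e;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because LINQ operators and foreach dispose the enumerator they obtain, an enumerator whose Dispose method closes the underlying stream would be cleaned up once the single pass finishes.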

While I generally agree with Jon Skeet's answer, I have also come across a few cases where working with IEnumerator directly did seem more appropriate than wrapping it in a once-only IEnumerable.

I'll start by illustrating one such case and by describing my own solution to the issue.

Case example: Forward-only, non-rewindable database cursors

ESRI's API for accessing geo-databases (ArcObjects) has forward-only database cursors that cannot be reset. They are essentially that API's equivalent of IEnumerator. But there is no equivalent to IEnumerable. So if you want to wrap that API in "the .NET way", you have three options (which I explored in the following order):

  1. Wrap the cursor as an IEnumerator (since that's what it really is) and work directly with that (which is cumbersome).

  2. Wrap the cursor, or the wrapping IEnumerator from (1), as a once-only IEnumerable (to make it LINQ-compatible and generally easier to work with). The mistake here is that it isn't really an IEnumerable, because it cannot be enumerated more than once, and this might be overlooked by users or maintainers of your code.

  3. Don't wrap the cursor itself as an IEnumerable, but rather that which can be used to retrieve a cursor (e.g. the query criteria and the reference to the database object being queried). That way, several iterations are possible simply by re-executing the whole query. This is what I eventually decided on back then.

That last option is the pragmatic solution that I would generally recommend for similar cases (if applicable). If you are looking for other solutions, read on.
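As a rough sketch of option 3, the wrapper can hold a delegate that opens a fresh cursor instead of holding the cursor itself. The class name `ReExecutingEnumerable` and the `openCursor` delegate are hypothetical; in the ArcObjects case the delegate would capture the query criteria and the database object, which are not shown here.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Stores how to obtain a cursor rather than the cursor itself.
// Sketch only; names are assumptions made for illustration.
public sealed class ReExecutingEnumerable<T> : IEnumerable<T>
{
    private readonly Func<IEnumerator<T>> _openCursor;

    public ReExecutingEnumerable(Func<IEnumerator<T>> openCursor)
    {
        _openCursor = openCursor ?? throw new ArgumentNullException(nameof(openCursor));
    }

    // Every enumeration re-executes the query and gets its own cursor.
    public IEnumerator<T> GetEnumerator() => _openCursor();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

The trade-off is that each pass pays for the query again, and two passes may see different data if the underlying source changes in between.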


Re-implement LINQ query operators for the IEnumerator<T> interface?

It's technically possible to implement some or all of LINQ's query operators for the IEnumerator<T> interface. One approach would be to write a bunch of extension methods, such as:

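// Must be declared inside a static class to be usable as an extension method.
public static IEnumerator<T> Where<T>(this IEnumerator<T> xs, Func<T, bool> predicate)
{
    // Advance the source enumerator and yield only the matching items.
    while (xs.MoveNext())
    {
        T x = xs.Current;
        if (predicate(x)) yield return x;
    }
}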

Let's consider a few key issues:

  • Operators must never return an IEnumerable<T>, because that would mean that you can break out of your own "LINQ to IEnumerator" world and escape into regular LINQ. There you'd end up with the non-repeatability issue already described above.

  • You cannot process the results of a query with a foreach loop… unless each of the IEnumerator<T> objects returned by your query operators implements a GetEnumerator method that returns this. Supplying that additional method would mean that you cannot use yield return/break, but have to write the IEnumerator<T> classes manually.

    This is just plain weird and possibly an abuse of either IEnumerator<T> or the foreach construct.

  • If returning IEnumerable<T> is forbidden and returning IEnumerator<T> is cumbersome (because foreach doesn't work), why not return plain arrays? Because then queries can no longer be lazy.
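To make the second point concrete, here is a sketch of what a hand-written operator result could look like. The class name `WhereEnumerator` is hypothetical; it filters a source enumerator and adds a GetEnumerator method returning this, which is all that foreach's pattern-based lookup requires.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// A manually written IEnumerator<T> that foreach accepts because it
// also exposes GetEnumerator(). Sketch only; names are assumptions.
public sealed class WhereEnumerator<T> : IEnumerator<T>
{
    private readonly IEnumerator<T> _source;
    private readonly Func<T, bool> _predicate;

    public WhereEnumerator(IEnumerator<T> source, Func<T, bool> predicate)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
        _predicate = predicate ?? throw new ArgumentNullException(nameof(predicate));
    }

    public T Current => _source.Current;
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        // Skip forward until an item satisfies the predicate.
        while (_source.MoveNext())
        {
            if (_predicate(_source.Current)) return true;
        }
        return false;
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() => _source.Dispose();

    // Returning this keeps the sequence one-shot while satisfying foreach.
    public WhereEnumerator<T> GetEnumerator() => this;
}
```

On C# 9 or later, foreach also recognizes an extension GetEnumerator method, so a single extension method on IEnumerator<T> that returns its argument may be enough to avoid writing such classes by hand.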


IQueryable + IEnumerator = IQueryator

What about delaying the execution of a query until it has been fully composed? In the IEnumerable world, that is what IQueryable does, so we could theoretically build an IEnumerator equivalent, which I shall call IQueryator.

  • IQueryator could check for logical errors, such as doing anything with the sequence after it has been completely consumed by a preceding operation like Count. I.e. all-consuming operators like Count would always have to be the last in a chain of query operators.

  • IQueryator could return an array (as suggested above) or some other read-only collection, not from the individual operators but only when the query gets executed.

Implementing IQueryator would take quite some time... the question is, would it actually be worth the effort?


