Most efficient way to append arrays in C#?

I am pulling data out of an old-school ActiveX in the form of arrays of doubles. I don't initially know the final number of samples I will actually retrieve.

What is the most efficient way to concatenate these arrays together in C# as I pull them out of the system?

You can't append to an actual array: the size of an array is fixed at creation time. Instead, use a List<T>, which can grow as it needs to.
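A minimal sketch of the List<T> approach, assuming the arrays pulled from the ActiveX source can be exposed as a sequence of double[] chunks. The names SampleCollector, Collect and chunks are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;

static class SampleCollector
{
    // 'chunks' stands in for the arrays pulled from the ActiveX source (assumption).
    public static double[] Collect(IEnumerable<double[]> chunks)
    {
        var samples = new List<double>();
        foreach (double[] chunk in chunks)
        {
            // AddRange copies the chunk in; the list grows its backing array as needed.
            samples.AddRange(chunk);
        }
        // One final copy into an array of exactly the right size.
        return samples.ToArray();
    }
}
```

If the consumer can work with the List<double> directly, the final ToArray() copy can be skipped.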

Alternatively, keep a list of arrays, and concatenate them all only when you've grabbed everything.
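The "list of arrays" variant could look roughly like this. It is only a sketch: ChunkJoiner and Join are made-up names, and it assumes every chunk has already been stored in a List<double[]>. The result array is allocated once, at its final size, and each chunk is copied exactly once.

```csharp
using System;
using System.Collections.Generic;

static class ChunkJoiner
{
    public static double[] Join(List<double[]> chunks)
    {
        // First pass: work out the total number of samples.
        int total = 0;
        foreach (double[] chunk in chunks)
            total += chunk.Length;

        // Second pass: copy each chunk to its position in the result.
        var result = new double[total];
        int offset = 0;
        foreach (double[] chunk in chunks)
        {
            Array.Copy(chunk, 0, result, offset, chunk.Length);
            offset += chunk.Length;
        }
        return result;
    }
}
```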

See Eric Lippert's blog post on arrays for more detail and insight than I could realistically provide :)

I believe if you have 2 arrays of the same type that you want to combine into a third array, there's a very simple way to do that.

Here's the code:

String[] theHTMLFiles = Directory.GetFiles(basePath, "*.html");
String[] thexmlFiles = Directory.GetFiles(basePath, "*.xml");
List<String> finalList = new List<String>(theHTMLFiles.Concat<string>(thexmlFiles));
String[] finalArray = finalList.ToArray();

Concatenating arrays is simple using the LINQ extension methods, which come standard with .NET 3.5 and later.

The biggest thing to remember is that LINQ works with IEnumerable<T> objects, so to get an array back as your result you must call the .ToArray() method at the end.

Example of concatenating two byte arrays:

byte[] firstArray = {2,45,79,33};
byte[] secondArray = {55,4,7,81};
byte[] result = firstArray.Concat(secondArray).ToArray();

I recommend the answer found here: How do I concatenate two arrays in C#?

e.g.

var z = new int[x.Length + y.Length];
x.CopyTo(z, 0);
y.CopyTo(z, x.Length);

The solution looks like great fun, but it is possible to concatenate arrays in just two statements. When you're handling large byte arrays, I suppose it is inefficient to use a linked list to contain each byte.

Here is a code sample for reading bytes from a stream and extending a byte array on the fly:

byte[] buf = new byte[8192];
byte[] result = new byte[0];
int count = 0;
do
{
    count = resStream.Read(buf, 0, buf.Length);
    if (count != 0)
    {
        Array.Resize(ref result, result.Length + count);
        Array.Copy(buf, 0, result, result.Length - count, count);
    }
}
while (count > 0); // any more data to read?
resStream.Close();

Using this, we can combine two arrays without any loop.

I believe if you have 2 arrays of the same type that you want to combine into one array, there's a very simple way to do that.

Here's the code:

String[] TextFils = Directory.GetFiles(basePath, "*.txt");
String[] ExcelFils = Directory.GetFiles(basePath, "*.xls");
String[] finalArray = TextFils.Concat(ExcelFils).ToArray();

or

String[] Fils = Directory.GetFiles(basePath, "*.txt");
String[] ExcelFils = Directory.GetFiles(basePath, "*.xls");
Fils = Fils.Concat(ExcelFils).ToArray();

If you can approximate the number of items that will be there at the end, use the overload of the List<T> constructor that takes the initial capacity as a parameter. This saves some expensive reallocations, in which the list copies its contents into a larger internal array. Otherwise you have to pay for them.
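As a rough illustration of the capacity overload, assuming an estimate of the final sample count is available up front (PresizedCollector, Collect and estimatedTotal are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

static class PresizedCollector
{
    public static List<double> Collect(IEnumerable<double[]> chunks, int estimatedTotal)
    {
        // List<T>(int capacity) allocates the backing array once, up front.
        var samples = new List<double>(estimatedTotal);
        foreach (double[] chunk in chunks)
        {
            // No reallocation happens while Count stays within the capacity.
            samples.AddRange(chunk);
        }
        return samples;
    }
}
```

If the estimate turns out too low, the list still works; it simply falls back to growing on demand.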

You might not need to concatenate the end result into a contiguous array. Instead, keep appending to the list as suggested by Jon. In the end you'll have a jagged array (well, almost rectangular in fact). When you need to access an element by index, use the following indexing scheme:

double x = list[i / sampleSize][i % sampleSize];

Iteration over the jagged array is also straightforward:

// Assumes list is a List<double[]>, which exposes Count rather than Length.
for (int iRow = 0; iRow < list.Count; ++iRow) {
  double[] row = list[iRow];
  for (int iCol = 0; iCol < row.Length; ++iCol) {
    double x = row[iCol];
  }
}

This saves you memory allocation and copying at the expense of slightly slower element access. Whether this will be a net performance gain depends on the size of your data, data access patterns and memory constraints.

Here is a usable class based on what Constantin said:

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        FastConcat<int> i = new FastConcat<int>();
        i.Add(new int[] { 0, 1, 2, 3, 4 });
        Console.WriteLine(i[0]);
        i.Add(new int[] { 5, 6, 7, 8, 9 });
        Console.WriteLine(i[4]);

        Console.WriteLine("Enumerator:");
        foreach (int val in i)
            Console.WriteLine(val);

        Console.ReadLine();
    }
}

class FastConcat<T> : IEnumerable<T>
{
    LinkedList<T[]> _items = new LinkedList<T[]>();
    int _count;

    public int Count
    {
        get
        {
            return _count;
        }
    }

    public void Add(T[] items)
    {
        if (items == null)
            return;
        if (items.Length == 0)
            return;

        _items.AddLast(items);
        _count += items.Length;
    }

    private T[] GetItemIndex(int realIndex, out int offset)
    {
        offset = 0; // Offset that needs to be applied to realIndex.
        int currentStart = 0; // Current index start.

        foreach (T[] items in _items)
        {
            currentStart += items.Length;
            if (currentStart > realIndex)
                return items;
            offset = currentStart;
        }
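        // Index is past the last element: fail clearly instead of returning null.
        throw new ArgumentOutOfRangeException(nameof(realIndex));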
    }

    public T this[int index]
    {
        get
        {
            int offset;
            T[] i = GetItemIndex(index, out offset);
            return i[index - offset];
        }
        set
        {
            int offset;
            T[] i = GetItemIndex(index, out offset);
            i[index - offset] = value;
        }
    }

    #region IEnumerable<T> Members

    public IEnumerator<T> GetEnumerator()
    {
        foreach (T[] items in _items)
            foreach (T item in items)
                yield return item;
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    #endregion
}

Olmo's suggestion is very good, but I'd add this: if you're not sure about the size, it's better to make it a little bigger than a little smaller. When a list is full, keep in mind it will double its capacity to add more elements.

For example: suppose you will need about 50 elements. If you use an initial capacity of 50 and the final number of elements is 51, you'll end up with a list of capacity 100 and 49 wasted positions.
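The effect is easy to observe through the Capacity property. The helper below is only a sketch with made-up names (CapacityDemo, CapacityAfter); the exact growth policy is an implementation detail of List<T>, although current .NET implementations double the capacity when the list is full.

```csharp
using System;
using System.Collections.Generic;

static class CapacityDemo
{
    // Returns the capacity of a list created with 'initialCapacity'
    // after 'itemCount' items have been added one at a time.
    public static int CapacityAfter(int initialCapacity, int itemCount)
    {
        var list = new List<int>(initialCapacity);
        for (int i = 0; i < itemCount; i++)
            list.Add(i);
        return list.Capacity;
    }
}
```

CapacityAfter(50, 50) stays at 50, while CapacityAfter(50, 51) reports a larger capacity because the 51st Add forces a reallocation. If the unused slots matter, List<T>.TrimExcess() can release them once all the data is in.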

I had the same issue to solve, with the requirement of appending a specific count instead of the whole array, and my first solution was the same as the one suggested by Hugo. But it felt inefficient because of the many resizes.

Then I remembered that StringBuilder is capacity-optimized. Next I asked myself whether the same applies to MemoryStream. After some tries I can say yes, it does.

The MemoryStream starts with a minimal capacity of 256 bytes and, when necessary, grows to double its previous capacity: 256, 512, 1024, 2048, 4096, 8192 and so on.

My next question was how long it takes to resize and copy an array compared with using a MemoryStream. In my tries, using a MemoryStream was much faster than resizing and copying the array.

Hence, I guess using a MemoryStream is the most efficient way, at least for byte data.
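A sketch of the MemoryStream approach for byte data, appending only the number of bytes actually read on each pass. StreamAppender and ReadAll are hypothetical names; the 8192-byte buffer size is taken from the earlier resize example.

```csharp
using System;
using System.IO;

static class StreamAppender
{
    // Reads 'source' to the end and returns everything as a single byte array.
    public static byte[] ReadAll(Stream source)
    {
        var buf = new byte[8192];
        using (var ms = new MemoryStream())
        {
            int count;
            while ((count = source.Read(buf, 0, buf.Length)) > 0)
            {
                // Append only the bytes that were actually read.
                ms.Write(buf, 0, count);
            }
            // ToArray copies the written bytes into an array of exactly that length.
            return ms.ToArray();
        }
    }
}
```

When the goal is simply to drain one stream into another, Stream.CopyTo does the same loop internally. For double[] data, the samples would first have to be converted to bytes (for example with Buffer.BlockCopy), so a List<double> is usually the more direct choice there.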
