Most efficient way to parse a large range into an array in VBA

I have a large range of data in Excel that I would like to parse into an array for a user-defined function. The range is 2250 x 2250. It takes far too long to read each cell in via a For loop, and it is too large to be assigned to an array via this method:

Dim myArr As Variant
myArr = Range("myrange")

Just brainstorming here: would it be more efficient to parse in each column and join the arrays? Any ideas?

Thanks

You're nearly there.

The code you need is:

Dim myArr As Variant
myArr = Range("myrange").Value2

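Note that I'm using the .Value2 property of the range, not just .Value. The .Value property takes cell formats and locale settings into account (date-formatted cells come back as Date variants and currency-formatted cells as Currency), and it will probably mangle any dates. .Value2 returns the underlying raw values instead.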
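To see the difference on a date, a quick check like the one below can help. It is only a sketch: the procedure name is made up for illustration, and it assumes cell A1 of the active sheet is free to overwrite.

```vba
' Hypothetical demo: compares what .Value and .Value2 return for a date cell.
' WARNING: overwrites A1 on the active sheet (an assumption of this sketch).
Sub ShowValueVsValue2()
    With ActiveSheet.Range("A1")
        .Value = DateSerial(2020, 1, 31)   ' Excel stores this as a serial number
        Debug.Print TypeName(.Value)       ' Date
        Debug.Print TypeName(.Value2)      ' Double
    End With
End Sub
```

Run it from the VBA editor and read the result in the Immediate window.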

Note, also, that I haven't bothered to ReDim and specify the dimensions of the array: the Value and Value2 properties return a 2-dimensional array, (1 To RowCount, 1 To ColCount)... unless the range is a single cell, in which case you get a scalar variant, which breaks any downstream code that expected an array. But that's not your problem with a known 2250 x 2250 range.
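If the range size is not known in advance, the single-cell case can be guarded against with a small wrapper. This is a minimal sketch, not part of the original answer: RangeToArray is a hypothetical name, and it assumes a contiguous, single-area range.

```vba
' Hypothetical helper: always returns a 1-based, 2-D Variant array,
' even when rng is a single cell. Assumes rng is one contiguous area.
Function RangeToArray(ByVal rng As Range) As Variant
    Dim result As Variant

    If rng.CountLarge = 1 Then
        ' A single cell yields a scalar, so wrap it in a 1 x 1 array.
        ReDim result(1 To 1, 1 To 1)
        result(1, 1) = rng.Value2
    Else
        ' Two or more cells already come back as a 2-D array.
        result = rng.Value2
    End If

    RangeToArray = result
End Function
```

Downstream code can then index the result as arr(row, col) without first checking IsArray.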

If you reverse the operation and write an array back to a range, you will need to set the size of the receiving range exactly to the dimensions of the array. Again, that's not the question you asked, but the two operations generally go together.
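One way to get the receiving range right is to size it from the array's own bounds. The helper below is an illustrative sketch: ArrayToRange and its parameter names are assumptions, and it expects a 2-D array.

```vba
' Hypothetical helper: writes a 2-D array to the sheet starting at topLeft,
' sizing the receiving range from the array's bounds.
Sub ArrayToRange(ByRef arr As Variant, ByVal topLeft As Range)
    Dim rowCount As Long, colCount As Long

    rowCount = UBound(arr, 1) - LBound(arr, 1) + 1
    colCount = UBound(arr, 2) - LBound(arr, 2) + 1

    ' Resize from the top-left cell so the target matches the array exactly.
    topLeft.Cells(1, 1).Resize(rowCount, colCount).Value2 = arr
End Sub
```

For example, ArrayToRange myArr, Range("A1") would write myArr back in a single operation.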

The general principle is that each 'hit' to the worksheet takes about a twentieth of a second - some machines are much faster, but they all have bad days - and the 'hit' of reading a single cell into a variable costs almost exactly the same as reading a seven-million-cell range into a variant array. Either way, a single read is several million times faster than reading that range in one cell at a time.
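The gap is easy to measure on your own machine. The sketch below times both approaches on a deliberately small block so the cell-by-cell loop finishes quickly. The procedure name and the 200 x 200 block at A1 are assumptions, and the timings will vary by machine.

```vba
' Hypothetical benchmark: cell-by-cell reads versus one bulk read.
Sub CompareReadTimes()
    Dim rng As Range
    Set rng = ActiveSheet.Range("A1").Resize(200, 200)   ' assumed test block

    Dim t As Double, r As Long, c As Long, v As Variant

    ' One worksheet access per cell.
    t = Timer
    For r = 1 To rng.Rows.Count
        For c = 1 To rng.Columns.Count
            v = rng.Cells(r, c).Value2
        Next c
    Next r
    Debug.Print "Cell by cell: "; Timer - t; " s"

    ' One worksheet access for the whole block.
    t = Timer
    v = rng.Value2
    Debug.Print "Single read:  "; Timer - t; " s"
End Sub
```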

Either way, you may as well count any operation in VBA as happening in zero time once you've done the 'read-in' and stopped interacting with the worksheet.

The numbers are all very rough-and-ready, but the general principles will hold, right up until the moment you start allocating arrays that won't fit in working memory and, again, that's not your problem today.

Remember to Erase the array variant when you've finished, rather than relying on it going out of scope: that'll make a difference with a range this size.

This works fine.

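Sub T()
    Dim A() As Variant

    ' One read: pull the whole 2250 x 2250 block into memory.
    A = Range("A2").Resize(2250, 2250).Value2

    ' Work on the in-memory array: set the diagonal to 1.
    Dim i As Long, j As Long
    For i = 1 To 2250
        For j = 1 To 2250
            If i = j Then A(i, j) = 1
        Next j
    Next i

    ' One write: push the modified array back to the same block.
    Range("A2").Resize(2250, 2250).Value2 = A
End Sub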

I think the best options are:

  1. Try to limit the data to a reasonable number, say 1,000,000 values at a time.
  2. Add some error handling to catch the Out of Memory error and then try again, but cut the size in half, then to a third, a quarter, etc., until it works.

Either way, if you're using data sets on the order of 5,000,000 values and you want to make sure that the program will run, you will need to adjust the code to chop up the data.
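A rough sketch of the second option follows. It reads the range in blocks of rows and shrinks the block whenever the read fails with error 7 (Out of memory). The procedure name is hypothetical, halving on each failure is a simplification of the "half, third, quarter" sequence above, and whether the error is raised in a trappable way may depend on the Excel version and available memory.

```vba
' Hypothetical sketch: processes a large range in row blocks, halving the
' block size whenever a read raises error 7 (Out of memory).
Sub ProcessInChunks(ByVal rng As Range)
    Const ERR_OUT_OF_MEMORY As Long = 7
    Dim totalRows As Long, chunkRows As Long, firstRow As Long, n As Long
    Dim block As Variant, errNum As Long

    totalRows = rng.Rows.Count
    chunkRows = totalRows      ' optimistic first attempt: everything at once
    firstRow = 1

    Do While firstRow <= totalRows
        ' Never read past the last row of rng.
        n = chunkRows
        If firstRow + n - 1 > totalRows Then n = totalRows - firstRow + 1

        On Error Resume Next
        block = rng.Rows(firstRow).Resize(n).Value2
        errNum = Err.Number    ' capture before On Error GoTo 0 resets Err
        On Error GoTo 0

        If errNum = 0 Then
            ' ... work on block here (a scalar if the block is one cell) ...
            block = Empty      ' release the block before the next read
            firstRow = firstRow + n
        ElseIf errNum = ERR_OUT_OF_MEMORY And chunkRows > 1 Then
            chunkRows = chunkRows \ 2   ' retry the same rows, smaller block
        Else
            Err.Raise errNum   ' anything else is not ours to handle
        End If
    Loop
End Sub
```

Calling ProcessInChunks Range("A2").Resize(2250, 2250) would try the whole block first and fall back to smaller blocks only if needed.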
