
How can I perform a binary search on an Excel range to find the last non-null cell?

I have an Excel (.xlsx) file that only has data in the first column. My goal is to find the last row with data in it. Right now, I'm checking each cell one by one (starting from the first row) to see if it's null. However, Excel lookups are fairly expensive—it's taking more than ten seconds to find the last row when there are ~10,000 data points.

I'd like to do a balanced binary search instead of a linear search. Assume that there will never be more than 100,000 rows, but let's look at a smaller example that assumes a maximum of 15 rows.

[Image: balanced binary search tree]

Suppose the last row is 11. Then the search path would look like this:

Row 8 = filled, next search = 12
Row 12 = null, next search = 10
Row 10 = filled, next search = 11
Row 11 = leaf node, last data row found.

This requires 4 Excel reads (3 if you don't count row 11, since it's a leaf) versus 11 if the search had been linear.

Here's the same search when the last row is 4.

Row 8 = null, next search = 4
Row 4 = filled, next search = 6
Row 6 = null, next search = 5
Row 5 = null and a leaf node, so the last data row must be 4.

This requires 4 Excel reads either way. However, at a larger scale with a maximum row of 100,000, the binary search would have a much better average execution time.
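The two search paths above can be reproduced without Excel. The following is a minimal sketch, not a definitive implementation: `ProbeTrace` and `ProbedRows` are hypothetical names, and the comparison `mid <= lastFilledRow` stands in for the actual cell read. It assumes the filled rows are contiguous starting at row 1.

```csharp
using System;
using System.Collections.Generic;

public static class ProbeTrace
{
    // Returns the rows a binary search would read, in order, when rows
    // 1..lastFilledRow hold data and every later row is empty.
    // maxRow is the assumed upper bound on the number of rows.
    public static List<int> ProbedRows(int lastFilledRow, int maxRow)
    {
        var probes = new List<int>();
        int first = 0;          // sentinel: row 0 does not exist, treated as filled
        int last = maxRow + 1;  // sentinel: one past the bound, treated as empty
        while (first + 1 < last)
        {
            int mid = first + (last - first) / 2;
            probes.Add(mid);
            // Assumption: this comparison replaces the Excel cell read.
            bool filled = mid <= lastFilledRow;
            if (filled) first = mid; else last = mid;
        }
        return probes;
    }
}
```

Under these assumptions, `ProbeTrace.ProbedRows(11, 15)` yields 8, 12, 10, 11 and `ProbeTrace.ProbedRows(4, 15)` yields 8, 4, 6, 5, matching the paths listed above. With a bound of 100,000 rows the loop performs at most 17 reads, since the search interval is halved on every iteration.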

Can someone help me with an implementation for this kind of search in C#?

I found this question, "Trying to find the last non-empty cell in a specific row/range over multiple sheets", but I'm interested in figuring out this algorithm in C#, not in using Excel formulas.

Here's the syntax for getting a cell's value with Microsoft.Office.Interop.Excel:

string value = myWorksheet.Cells[3, 4].Text; // row 3, column 4

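If you know the last possible row, you can run a binary search on your Excel data. This relies on the filled cells being contiguous from row 1, as in your example. Set last to one past the largest possible row; otherwise that row is never checked:

// Invariant: row first is filled (or first == 0); row last is empty or past the end.
var first = 0;          // row 0 does not exist, so it serves as the "filled" sentinel
var last = 100001;      // one past the largest possible row (100,000)
while (first + 1 < last) {
    var mid = first + (last - first) / 2;
    if (string.IsNullOrEmpty(myWorksheet.Cells[mid, 1].Text)) {
        last = mid;     // mid is empty: the last data row comes before mid
    } else {
        first = mid;    // mid is filled: the last data row is mid or later
    }
}
// first now holds the last filled row, or 0 if the column is empty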

Demo.
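To try the loop without an Excel instance, the cell read can be passed in as a callback. The following is a minimal sketch under that assumption: `LastRowFinder`, `FindLastFilledRow`, and the `isFilled` delegate are hypothetical names introduced for illustration, not part of the interop library.

```csharp
using System;

public static class LastRowFinder
{
    // Returns the last row for which isFilled(row) is true, or 0 when row 1
    // is already empty. Assumes the filled rows are contiguous from row 1
    // and that no data exists beyond maxRow.
    public static int FindLastFilledRow(Func<int, bool> isFilled, int maxRow)
    {
        if (isFilled == null) throw new ArgumentNullException(nameof(isFilled));
        if (maxRow < 0) throw new ArgumentOutOfRangeException(nameof(maxRow));

        int first = 0;          // sentinel: treated as filled, never read
        int last = maxRow + 1;  // sentinel: treated as empty, never read
        while (first + 1 < last)
        {
            int mid = first + (last - first) / 2;
            if (isFilled(mid))
                first = mid;    // the answer is mid or a later row
            else
                last = mid;     // the answer is before mid
        }
        return first;
    }
}
```

With the worksheet from the question, a call might look like `LastRowFinder.FindLastFilledRow(row => !string.IsNullOrEmpty((string)myWorksheet.Cells[row, 1].Text), 100000)`, assuming the `Cells` indexer is dynamically typed as in the snippet above. Keeping the read behind a delegate also lets the search be unit tested against an in-memory predicate such as `row => row <= 11`.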
