
How do I read in a single column from an Excel spreadsheet?

I'm trying to read a single column from an Excel document. I'd like to read the entire column, but obviously only store the cells that have data. I'd also like to handle the case where one or more cells in the column are empty but there are values farther down the column, so that the later values are still read. For example:

| Column1 |
|---------|
|bob      |
|tom      |
|randy    |
|travis   |
|joe      |
|         |
|jennifer |
|sam      |
|debby    |

If I had that column, I don't mind having a value of "" for the row after joe, but I do want it to keep getting values after the blank cell. However, I do not want it to go on for 35,000 lines past debby, assuming debby is the last value in the column.

It is also safe to assume that this will always be the first column.

So far, I have this:

Excel.Application myApplication = new Excel.Application();
myApplication.Visible = true;
Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);

foreach (Excel.Range r in myRange)
{
    MessageBox.Show(r.Text);
}

I've found lots of examples from older versions of .NET that do similar things, but not exactly this, and I wanted to make sure I did something more modern (assuming the way to do this has changed somewhat).

My current code reads the entire column, but includes blank cells after the last value.


EDIT1

I liked Isedlacek's answer below, but I do have a problem with it, and I'm not certain it is specific to his code. If I use it in this way:

Excel.Application myApplication = new Excel.Application();
myApplication.Visible = true;
Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);

var nonEmptyRanges = myRange.Cast<Excel.Range>()
.Where(r => !string.IsNullOrEmpty(r.Text));

foreach (var r in nonEmptyRanges)
{
    MessageBox.Show(r.Text);
}

MessageBox.Show("Finished!");

the "Finished!" MessageBox never shows. I'm not sure why that happens, but it appears to never actually finish searching. I tried adding a counter to the loop to see if it was just continuously searching through the column, but it doesn't appear to be; it appears to just stop.

Where the "Finished!" MessageBox is, I tried to just close the workbook and spreadsheet, but that code never ran (as expected, since the MessageBox never ran).

If I close the Excel spreadsheet manually, I get a COMException:

COMException was unhandled by user code
Additional information: Exception from HRESULT: 0x803A09A2

Any ideas?

The answer depends on whether you want to get the bounding range of the used cells or the non-null values from a column.

Here's how you can efficiently get the non-null values from a column. Note that reading the entire tempRange.Value2 property at once is MUCH faster than reading cell by cell, but the tradeoff is that the resulting array can use a lot of memory.

private static IEnumerable<object> GetNonNullValuesInColumn(_Application application, _Worksheet worksheet, string columnName)
{
    // get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
    var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);

    // if there is no intersection, there are no values in the column
    if (tempRange == null)
        yield break;

    // get complete set of values from the temp range (potentially memory-intensive)
    var value = tempRange.Value2;

    // if value is NULL, it's a single cell with no value
    if (value == null)
        yield break;

    // if value is not an array, the temp range was a single cell with a value
    if (!(value is Array))
    {
        yield return value;
        yield break;
    }

    // otherwise, the value is a 2-D array
    var value2 = (object[,]) value;
    var rowCount = value2.GetLength(0);
    for (var row = 1; row <= rowCount; ++row)
    {
        var v = value2[row, 1];
        if (v != null)
            yield return v;
    }
}
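
The question asked for something slightly different from dropping every blank: keep a "" for blanks between values, but stop at the last value. A minimal sketch of that step, assuming the input is a one-column 2-D array shaped like the one tempRange.Value2 returns above. The class and method names here are hypothetical and not part of the answer's code, and the sketch needs no Excel reference, so it can be tried on a hand-built array.

```csharp
using System;
using System.Collections.Generic;

static class ColumnValues
{
    // Hypothetical helper: turns a one-column 2-D array into strings.
    // Blanks between values become "", blanks after the last value are dropped.
    public static List<string> TrimTrailingBlanks(object[,] values)
    {
        // Use the array's own bounds so both 0-based and 1-based arrays work
        // (arrays returned by Excel interop are 1-based).
        int firstRow = values.GetLowerBound(0);
        int lastRow = values.GetUpperBound(0);
        int col = values.GetLowerBound(1);

        // Walk back from the bottom until a cell with content is found.
        while (lastRow >= firstRow && IsBlank(values[lastRow, col]))
            --lastRow;

        var result = new List<string>();
        for (int row = firstRow; row <= lastRow; ++row)
            result.Add(Convert.ToString(values[row, col]) ?? "");
        return result;
    }

    private static bool IsBlank(object value)
    {
        return value == null || (value is string s && s.Length == 0);
    }
}
```

With the sample column from the question, the result would hold the nine names with a single "" between joe and jennifer, and nothing after debby.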

Here's an efficient way to get the minimum range that contains the non-empty cells in a column. Note that I am still reading the entire set of tempRange values at once, and then I use the resulting array (if it is a multi-cell range) to determine which cells contain the first and last values. Then I construct the bounding range after having figured out which rows have data.

private static Range GetNonEmptyRangeInColumn(_Application application, _Worksheet worksheet, string columnName)
{
    // get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
    var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);

    // if there is no intersection, there are no values in the column
    if (tempRange == null)
        return null;

    // get complete set of values from the temp range (potentially memory-intensive)
    var value = tempRange.Value2;

    // if value is NULL, it's a single cell with no value
    if (value == null)
        return null;

    // if value is not an array, the temp range was a single cell with a value
    if (!(value is Array))
        return tempRange;

    // otherwise, the temp range is a 2D array which may have leading or trailing empty cells
    var value2 = (object[,]) value;

    // get the first and last rows that contain values
    var rowCount = value2.GetLength(0);
    int firstRowIndex;
    for (firstRowIndex = 1; firstRowIndex <= rowCount; ++firstRowIndex)
    {
        if (value2[firstRowIndex, 1] != null)
            break;
    }
    int lastRowIndex;
    for (lastRowIndex = rowCount; lastRowIndex >= firstRowIndex; --lastRowIndex)
    {
        if (value2[lastRowIndex, 1] != null)
            break;
    }

    // if there are no first and last used row, there is no used range in the column
    if (firstRowIndex > lastRowIndex)
        return null;

    // return the range
    return worksheet.Range[tempRange[firstRowIndex, 1], tempRange[lastRowIndex, 1]];
}

If you don't mind losing the empty rows completely:

var nonEmptyRanges = myRange.Cast<Excel.Range>()
    .Where(r => !string.IsNullOrEmpty(r.Text));
foreach (var r in nonEmptyRanges)
{
    // handle the r
    MessageBox.Show(r.Text);
}
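
When myRange is "A:A", it covers every row on the sheet (1,048,576 in an .xlsx workbook), and each r.Text is a separate COM call. That is a plausible reason, though not one confirmed in this thread, why the loop in EDIT1 seems never to finish. A minimal sketch that first narrows the column to the sheet's used range, borrowing the Intersect idea from the other answer. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using Excel = Microsoft.Office.Interop.Excel;

static class ColumnReader
{
    // Hypothetical helper: yields the text of each non-empty cell in column A,
    // visiting only the part of the column that lies inside the used range.
    public static IEnumerable<string> NonEmptyTextInColumnA(
        Excel.Application application, Excel.Worksheet worksheet)
    {
        Excel.Range column = worksheet.get_Range("A:A", System.Type.Missing);
        Excel.Range bounded = application.Intersect(worksheet.UsedRange, column);

        // No overlap means column A holds no data.
        if (bounded == null)
            yield break;

        foreach (Excel.Range cell in bounded.Cells)
        {
            string text = (string)cell.Text;
            if (!string.IsNullOrEmpty(text))
                yield return text;
        }
    }
}
```

This still makes one COM call per cell, so for large sheets reading Value2 once, as in the other answer, is likely to be faster.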
    /// <summary>
    /// Generic method which reads a column from the <paramref name="workSheetToReadFrom"/> sheet provided.<para />
    /// The <paramref name="dumpVariable"/> is the variable into which the column being read is dumped.<para />
    /// The <paramref name="workSheetToReadFrom"/> is the sheet from which the column is read.<para />
    /// The <paramref name="initialCellRowIndex"/>, <paramref name="finalCellRowIndex"/> and <paramref name="columnIndex"/> specify the length of the list to be read and the concrete column of the file from which to perform the reading. <para />
    /// Note that the type of data to be read needs to be specified as a generic type argument. The method constrains the generic type argument to types which implement the IConvertible interface provided by the framework (e.g. int, double, string, etc.).
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="dumpVariable"></param>
    /// <param name="workSheetToReadFrom"></param>
    /// <param name="initialCellRowIndex"></param>
    /// <param name="finalCellRowIndex"></param>
    /// <param name="columnIndex"></param>
    static void ReadExcelColumn<T>(ref List<T> dumpVariable, Excel._Worksheet workSheetToReadFrom, int initialCellRowIndex, int finalCellRowIndex, int columnIndex) where T: IConvertible
    {
        dumpVariable = ((object[,])workSheetToReadFrom.Range[workSheetToReadFrom.Cells[initialCellRowIndex, columnIndex], workSheetToReadFrom.Cells[finalCellRowIndex, columnIndex]].Value2).Cast<object>().ToList().ConvertAll(e => (T)Convert.ChangeType(e, typeof(T)));
    }
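
One caveat with the one-liner above: Convert.ChangeType throws an InvalidCastException when the value is null and T is a value type, so a blank cell in the range breaks a call such as ReadExcelColumn<int>. The cast to object[,] also fails when the range is a single cell, because Value2 is then not an array. A minimal sketch of the conversion step that skips blanks, assuming the values have already been read into a one-column 2-D array. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class ColumnConversion
{
    // Hypothetical helper: converts the non-blank cells of a one-column
    // 2-D array to T, skipping null entries instead of throwing on them.
    public static List<T> ConvertNonBlank<T>(object[,] values) where T : IConvertible
    {
        var result = new List<T>();
        int col = values.GetLowerBound(1);

        for (int row = values.GetLowerBound(0); row <= values.GetUpperBound(0); ++row)
        {
            object cell = values[row, col];
            if (cell == null)
                continue; // blank cell: nothing to convert

            result.Add((T)Convert.ChangeType(cell, typeof(T), CultureInfo.InvariantCulture));
        }
        return result;
    }
}
```

Skipping blanks changes the row alignment of the result, so this only fits cases where the positions of the values do not matter.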
