How do I read in a single column from an Excel spreadsheet?

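I'm trying to read a single column from an Excel document. I'd like to read the entire column, but only store the cells that contain data. I also want to handle the case where a cell in the column is empty but there are more values farther down: the later values should still be read. For example: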

| Column1 |
|---------|
|bob      |
|tom      |
|randy    |
|travis   |
|joe      |
|         |
|jennifer |
|sam      |
|debby    |

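Given that column, I wouldn't mind getting a value of "" for the row after joe, but I do want it to keep reading values after the blank cell. However, if debby is the last value in the column, I don't want it to continue for 35,000 rows past debby.

It is also safe to assume that this will always be the first column.

So far, I have this: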

Excel.Application myApplication = new Excel.Application();
myApplication.Visible = true;
Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);

foreach (Excel.Range r in myRange)
{
    MessageBox.Show(r.Text);
}

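I have found many examples from earlier versions of .NET that do something similar, but not exactly this, and I want to make sure I'm doing it in a more modern way (assuming the approach has changed somewhat).

My current code reads the entire column, but it includes the blank cells after the last value.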


EDIT1

I like Isedlacek's answer below, but I ran into a problem with it, and I'm not sure whether it is specific to his code. If I use it this way:

Excel.Application myApplication = new Excel.Application();
myApplication.Visible = true;
Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);

var nonEmptyRanges = myRange.Cast<Excel.Range>()
.Where(r => !string.IsNullOrEmpty(r.Text));

foreach (var r in nonEmptyRanges)
{
    MessageBox.Show(r.Text);
}

MessageBox.Show("Finished!");

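The Finished! MessageBox is never shown. I'm not sure why, but the search never seems to actually finish. I tried adding a counter to the loop to see whether it was just searching the column endlessly, but it doesn't appear to be... it just seems to stop.

Where the Finished! MessageBox is, I originally tried to close the workbook and the spreadsheet, but that code never ran (as expected, since the MessageBox never ran either).

If I close the Excel spreadsheet manually, I get a COMException:

COMException was unhandled by user code
Additional information: Exception from HRESULT: 0x803A09A2

Any ideas?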

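The answer depends on whether you want the bounding range of the used cells or the non-null values from the column.

Here is how to get the non-null values from a column efficiently. Note that reading the whole tempRange.Value2 property at once is much faster than reading cell by cell, at the cost of the memory used by the resulting array.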

private static IEnumerable<object> GetNonNullValuesInColumn(_Application application, _Worksheet worksheet, string columnName)
{
    // get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
    var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);

    // if there is no intersection, there are no values in the column
    if (tempRange == null)
        yield break;

    // get complete set of values from the temp range (potentially memory-intensive)
    var value = tempRange.Value2;

    // if value is NULL, it's a single cell with no value
    if (value == null)
        yield break;

    // if value is not an array, the temp range was a single cell with a value
    if (!(value is Array))
    {
        yield return value;
        yield break;
    }

    // otherwise, the value is a 2-D array
    var value2 = (object[,]) value;
    var rowCount = value2.GetLength(0);
    for (var row = 1; row <= rowCount; ++row)
    {
        var v = value2[row, 1];
        if (v != null)
            yield return v;
    }
}
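
The question asks for something slightly different from the method above: blanks in the middle of the column should come back as "", and only the blanks after the last value should be dropped. A minimal sketch of that variant follows. It assumes the value has already been read from Value2 as above, so it has no Excel dependency; ColumnValues and ToTrimmedList are hypothetical names, not part of the interop API.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnValues
{
    // Hypothetical helper: converts the result of Range.Value2 for a
    // one-column range into strings, keeping interior blanks as ""
    // and dropping the blanks that follow the last value.
    public static List<string> ToTrimmedList(object value2)
    {
        var result = new List<string>();

        // null means the range was a single empty cell
        if (value2 == null)
            return result;

        // anything that is not a 2-D array is a single cell with a value
        var cells = value2 as object[,];
        if (cells == null)
        {
            result.Add(Convert.ToString(value2) ?? "");
            return result;
        }

        // use the array's own bounds (arrays returned by Excel are 1-based)
        int firstRow = cells.GetLowerBound(0);
        int lastRow = cells.GetUpperBound(0);
        int column = cells.GetLowerBound(1);

        // walk back from the end to the last non-empty cell
        while (lastRow >= firstRow && cells[lastRow, column] == null)
            --lastRow;

        // copy everything up to that cell; empty cells become ""
        for (int row = firstRow; row <= lastRow; ++row)
            result.Add(Convert.ToString(cells[row, column]) ?? "");

        return result;
    }
}
```

Under these assumptions, a call such as ColumnValues.ToTrimmedList(tempRange.Value2) would return the sample column with a "" entry between joe and jennifer and nothing after debby.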

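Here is an efficient way to get the smallest range that contains the non-empty cells in a column. Note that I still read the entire set of tempRange values at once, and then use the resulting array (if it is a multi-cell range) to determine which cells hold the first and last values. Once I know which rows contain data, I construct the bounding range.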

private static Range GetNonEmptyRangeInColumn(_Application application, _Worksheet worksheet, string columnName)
{
    // get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
    var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);

    // if there is no intersection, there are no values in the column
    if (tempRange == null)
        return null;

    // get complete set of values from the temp range (potentially memory-intensive)
    var value = tempRange.Value2;

    // if value is NULL, it's a single cell with no value
    if (value == null)
        return null;

    // if value is not an array, the temp range was a single cell with a value
    if (!(value is Array))
        return tempRange;

    // otherwise, the temp range is a 2D array which may have leading or trailing empty cells
    var value2 = (object[,]) value;

    // get the first and last rows that contain values
    var rowCount = value2.GetLength(0);
    int firstRowIndex;
    for (firstRowIndex = 1; firstRowIndex <= rowCount; ++firstRowIndex)
    {
        if (value2[firstRowIndex, 1] != null)
            break;
    }
    int lastRowIndex;
    for (lastRowIndex = rowCount; lastRowIndex >= firstRowIndex; --lastRowIndex)
    {
        if (value2[lastRowIndex, 1] != null)
            break;
    }

    // if there are no first and last used row, there is no used range in the column
    if (firstRowIndex > lastRowIndex)
        return null;

    // return the range
    return worksheet.Range[tempRange[firstRowIndex, 1], tempRange[lastRowIndex, 1]];
}

If you don't mind losing the empty rows entirely:

var nonEmptyRanges = myRange.Cast<Excel.Range>()
    .Where(r => !string.IsNullOrEmpty(r.Text));
foreach (var r in nonEmptyRanges)
{
    // handle the r
    MessageBox.Show(r.Text);
}

    /// <summary>
    /// Generic method which reads a column from the <paramref name="workSheetToReadFrom"/> sheet provided.<para />
    /// The <paramref name="dumpVariable"/> is the variable upon which the column to be read is going to be dumped.<para />
    /// The <paramref name="workSheetToReadFrom"/> is the sheet from which the column is going to be read.<para />
    /// The <paramref name="initialCellRowIndex"/>, <paramref name="finalCellRowIndex"/> and <paramref name="columnIndex"/> specify the length of the list to be read and the concrete column of the file from which to perform the reading. <para />
    /// Note that the type of data which is going to be read needs to be specified as a generic type argument. The method constrains the generic type arguments which can be passed to it to the types which implement the IConvertible interface provided by the framework (e.g. int, double, string, etc.).
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="dumpVariable"></param>
    /// <param name="workSheetToReadFrom"></param>
    /// <param name="initialCellRowIndex"></param>
    /// <param name="finalCellRowIndex"></param>
    /// <param name="columnIndex"></param>
    static void ReadExcelColumn<T>(ref List<T> dumpVariable, Excel._Worksheet workSheetToReadFrom, int initialCellRowIndex, int finalCellRowIndex, int columnIndex) where T: IConvertible
    {
        dumpVariable = ((object[,])workSheetToReadFrom.Range[workSheetToReadFrom.Cells[initialCellRowIndex, columnIndex], workSheetToReadFrom.Cells[finalCellRowIndex, columnIndex]].Value2).Cast<object>().ToList().ConvertAll(e => (T)Convert.ChangeType(e, typeof(T)));
    }
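
The one-liner above assumes that Value2 always returns a 2-D array with a convertible value in every cell. For a single-cell range Value2 returns a scalar, so the cast to object[,] would fail, and Convert.ChangeType throws on an empty (null) cell when T is a value type such as int. A hedged sketch of the conversion step that tolerates both cases is shown below. ColumnConversion and ToTypedList are hypothetical names, and skipping empty cells is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ColumnConversion
{
    // Hypothetical helper: converts the result of Range.Value2 into a List<T>.
    // Empty cells (null) are skipped, and a single-cell result, which is not
    // an array, becomes a one-element list.
    public static List<T> ToTypedList<T>(object value2) where T : IConvertible
    {
        var result = new List<T>();

        // a single empty cell
        if (value2 == null)
            return result;

        if (value2 is object[,] cells)
        {
            // foreach visits a 2-D array row by row, whatever its lower bounds
            foreach (object cell in cells)
            {
                if (cell != null)
                    result.Add((T)Convert.ChangeType(cell, typeof(T), CultureInfo.InvariantCulture));
            }
        }
        else
        {
            // a single cell with a value
            result.Add((T)Convert.ChangeType(value2, typeof(T), CultureInfo.InvariantCulture));
        }

        return result;
    }
}
```

With this in place, ReadExcelColumn could pass the Value2 of its range to ColumnConversion.ToTypedList<T> instead of casting directly. A cell whose content cannot be converted to T (for example text when T is int) still throws, which is probably the right behavior for a typed read.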
