OpenXML SDK2.5 (Excel): How to determine if a cell contains a numeric value?

I am developing a component that imports data from an MS Excel (2016) file. The component uses the MS OpenXML SDK2.5 library. The end user's installation of MS Excel uses Dutch country/region settings. The file contains, among other things, a column with financial (numeric) data. The position of this column is not known in advance.

To determine whether a cell contains numeric data I evaluate the property Cell.DataType (of type CellValues, which is an enum). At first this property seems to be the perfect candidate. The possible values of CellValues are: Boolean, Number, Error, SharedString, String, InlineString and Date. So I would expect Cell.DataType to be set to CellValues.Number. After some debugging I found out that Cell.DataType is null when the cell contains numeric data.

While searching the internet for an explanation I found the following MSDN article: https://msdn.microsoft.com/en-us/library/office/hh298534.aspx

The article describes exactly what I found during debugging:

The Cell type provides a DataType property that indicates the type of the data within the cell. The value of the DataType property is null for numeric and date types.

Does anybody know why Cell.DataType is not initialized with CellValues.Number or CellValues.Date respectively?

What is the best way to determine if a cell contains a numeric value?

Does anybody know why Cell.DataType is not initialized with CellValues.Number or CellValues.Date respectively?

Looking at the ECMA-376 standard, the (abbreviated) XSD for a Cell looks like this:

<xsd:complexType name="CT_Cell">
    ...
    <xsd:attribute name="t" type="ST_CellType" use="optional" default="n"/>
    ...
</xsd:complexType>

That attribute represents the type. Note that it is optional, with a default value of "n". Section 18.18.11 ST_CellType (Cell Type) lists the valid values for the type, which are:

b - boolean
d - date
e - error
inlineStr - an inline string
n - number (the default)
s - a shared string
str - a formula string

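You can see that "n" represents a number. Because it is the default, a writer may omit the attribute for numeric cells, and the SDK then reports Cell.DataType as null.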
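To make the effect of that default concrete, here is a minimal sketch that works on the raw worksheet XML rather than through the SDK. It uses Python's standard library only; the sample sheet data and the helper names cell_type and numeric_cell_refs are made up for illustration and are not part of the OpenXML SDK.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Illustrative sheet data (an assumption, not taken from a real file).
SHEET_XML = """<sheetData xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <row r="1">
    <c r="A1"><v>12.5</v></c>
    <c r="B1" t="n"><v>3</v></c>
    <c r="C1" t="s"><v>0</v></c>
    <c r="D1" t="b"><v>1</v></c>
  </row>
</sheetData>"""


def cell_type(cell):
    """Return the cell's "t" attribute, applying the schema default "n"."""
    return cell.get("t", "n")


def numeric_cell_refs(sheet_xml):
    """Return the references of all cells whose effective type is "n"."""
    root = ET.fromstring(sheet_xml)
    return [c.get("r") for c in root.iterfind(".//m:c", NS) if cell_type(c) == "n"]


print(numeric_cell_refs(SHEET_XML))  # ['A1', 'B1']
```

A1 has no "t" attribute at all and B1 states it explicitly; both count as numbers, which corresponds to checking for a null Cell.DataType or CellValues.Number in the SDK.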

What is the best way to determine if a cell contains a numeric value?

It would seem from the above that you could check for a null Cell.DataType or a Cell.DataType of CellValues.Number to tell whether a cell contains a number, but it's not quite that simple: the big problem is dates.

It would seem that the original storage mechanism for dates was to use a number and rely on the style to know whether the number is actually a number or represents a date.

Confusingly, the spec has been updated to include the Date type, but not all dates will use the Date type. The Date type means the cell contains a date in ISO 8601 format, but it's perfectly valid for a date to be stored as a number with the correct style. The following XML snippet, for example, shows the same date (1st Feb 2017) in both Number and Date format:

<sheetData>
    <row r="1" spans="1:1" x14ac:dyDescent="0.25">
        <c r="A1" s="1">
            <v>42767</v>
        </c>
    </row>
    <row r="2" spans="1:1" x14ac:dyDescent="0.25">
        <c r="A2" s="1" t="d">
            <v>2017-02-01</v>
        </c>
    </row>
</sheetData>

Which looks like this when opened in Excel:

[Image: the resulting Excel file]
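The equivalence of the two cells can be checked by hand. The sketch below assumes the workbook uses the 1900 date system and that the caller already knows the cell holds a date; under those assumptions a serial number is a day count from 30 December 1899 (valid for serials after the fictitious 29 Feb 1900). It re-types the snippet above with the namespace declared and the x14ac attribute dropped so that it parses on its own. The helper name cell_as_date is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# The two cells from the snippet above, made self-contained.
SHEET_XML = """<sheetData xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <row r="1"><c r="A1" s="1"><v>42767</v></c></row>
  <row r="2"><c r="A2" s="1" t="d"><v>2017-02-01</v></c></row>
</sheetData>"""

# Assumption: 1900 date system, serial numbers greater than 60.
EXCEL_EPOCH = date(1899, 12, 30)


def cell_as_date(cell):
    """Interpret a cell that is already known to hold a date."""
    raw = cell.find("m:v", NS).text
    if cell.get("t", "n") == "d":
        # ISO 8601 text; keep only the date part if a time follows.
        return date.fromisoformat(raw[:10])
    # Numeric storage: whole days since the epoch, time fraction ignored.
    return EXCEL_EPOCH + timedelta(days=int(float(raw)))


root = ET.fromstring(SHEET_XML)
dates = {c.get("r"): cell_as_date(c) for c in root.iterfind(".//m:c", NS)}
print(dates["A1"], dates["A2"])  # 2017-02-01 2017-02-01
```

Nothing in cell A1 itself says that 42767 is a date; only the style referenced by s="1" does.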

If you need to differentiate between dates and numbers then you will need to find any numbers (a null Cell.DataType or a Cell.DataType of CellValues.Number) and then check the style of those cells to ensure they are numbers and not dates disguised as numbers.
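One way to do that style check is sketched below, again on raw XML with Python's standard library. It resolves each cell's "s" attribute to an xf entry in cellXfs and looks at its numFmtId. The built-in IDs 14 to 22 and 45 to 47 are the date and time formats; for custom formats the sketch falls back to a rough heuristic on the format code, which is an assumption and will misjudge unusual codes. The stylesheet, the sheet data and all helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}
BUILTIN_DATE_FORMAT_IDS = set(range(14, 23)) | {45, 46, 47}

# Hypothetical, heavily trimmed styles part.
STYLES_XML = """<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <numFmts count="2">
    <numFmt numFmtId="164" formatCode="dd/mm/yyyy"/>
    <numFmt numFmtId="165" formatCode="#,##0.00"/>
  </numFmts>
  <cellXfs count="4">
    <xf numFmtId="0"/><xf numFmtId="14"/><xf numFmtId="164"/><xf numFmtId="165"/>
  </cellXfs>
</styleSheet>"""

SHEET_XML = """<sheetData xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <row r="1"><c r="A1"><v>1234.56</v></c></row>
  <row r="2"><c r="A2" s="1"><v>42767</v></c></row>
  <row r="3"><c r="A3" s="2"><v>42767</v></c></row>
  <row r="4"><c r="A4" s="3"><v>99.5</v></c></row>
  <row r="5"><c r="A5" t="s"><v>0</v></c></row>
</sheetData>"""


def looks_like_date_format(format_code):
    """Heuristic: date/time letters left after dropping literals and [..] parts."""
    stripped = re.sub(r'"[^"]*"|\[[^\]]*\]|\\.', "", format_code)
    return re.search(r"[dmyhs]", stripped, re.IGNORECASE) is not None


def date_style_indexes(styles_xml):
    """Return the cellXfs indexes whose number format is a date or time format."""
    root = ET.fromstring(styles_xml)
    custom = {int(f.get("numFmtId")): f.get("formatCode")
              for f in root.iterfind("m:numFmts/m:numFmt", NS)}
    result = set()
    for index, xf in enumerate(root.iterfind("m:cellXfs/m:xf", NS)):
        fmt_id = int(xf.get("numFmtId", "0"))
        if fmt_id in BUILTIN_DATE_FORMAT_IDS or (
                fmt_id in custom and looks_like_date_format(custom[fmt_id])):
            result.add(index)
    return result


def is_plain_number(cell, date_styles):
    """True for numeric cells whose style is not a date or time style."""
    if cell.get("t", "n") != "n":
        return False
    return int(cell.get("s", "0")) not in date_styles


date_styles = date_style_indexes(STYLES_XML)
cells = ET.fromstring(SHEET_XML).iterfind(".//m:c", NS)
plain = [c.get("r") for c in cells if is_plain_number(c, date_styles)]
print(plain)  # ['A1', 'A4']
```

In the SDK the same lookup would go from Cell.StyleIndex to the workbook's stylesheet; the XML view is used here only because it can be run without an Excel file.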
