Oracle SQL - Selecting a date from an XML tag and casting it to a DATE type to allow comparison of the dates

I have a CLOB type column (called xml) that stores an XML document. The XML document has a tag called <productionDate>.

What I want to do is search the table and retrieve every row whose XML document has a productionDate between two dates.

I know how to read the XML tag as a value:

Select 
xmltype(xml).extract('//product/productionDate/text()').getStringVal() from myTable

The above query returns the following dates:

1999-09-23 00:00:00.0 UTC
2000-01-18 00:00:00.0 UTC
2000-01-18 00:00:00.0 UTC
1999-11-02 00:00:00.0 UTC
1999-11-02 00:00:00.0 UTC
1999-11-02 00:00:00.0 UTC
1999-11-02 00:00:00.0 UTC
1999-11-02 00:00:00.0 UTC
1999-11-02 00:00:00.0 UTC

I want to list only the ones that are between 01-NOV-1999 and 01-JAN-2000. To do this I tried to cast the value to a DATE type:

Select 
XMLCast(xmltype(xml).extract('//product/productionDate/text()').getStringVal() as DATE) from myTable

I get the following error:

ORA-01830: date format picture ends before converting entire input string

Presumably this is because the string is not in a format that can be converted to a date directly. What can I do to read the value as a valid date (ignoring the time and the text UTC) so that I can do a comparison on the date?

Maybe a substring might work, but I want to hear what other, more efficient options are available.

You're getting the value as a string, so you should just be able to use to_date or to_timestamp:

Select 
  to_timestamp(xmltype(xml).extract('//product/productionDate/text()').getStringVal(),
    'YYYY-MM-DD HH24:MI:SS.FF "UTC"')
from myTable

SQL Fiddle demo.
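To get back to the original goal of filtering between two dates, the same conversion can be placed in the WHERE clause. This is only a minimal sketch: it assumes the table and column names from the question, and that every row holds exactly one productionDate in the format shown above.

```sql
-- Sketch only: myTable and xml are the names used in the question.
-- Assumes each document has exactly one <productionDate> in the
-- 'YYYY-MM-DD HH24:MI:SS.FF UTC' layout.
SELECT t.*
FROM   myTable t
WHERE  TO_TIMESTAMP(
         XMLTYPE(t.xml).extract('//product/productionDate/text()').getStringVal(),
         'YYYY-MM-DD HH24:MI:SS.FF "UTC"')
       BETWEEN TIMESTAMP '1999-11-01 00:00:00'
           AND TIMESTAMP '2000-01-01 00:00:00';
```

BETWEEN is inclusive at both ends, so a value of exactly 2000-01-01 00:00:00 would be returned.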

You could move it to a different time zone at the same time if you needed to.
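One way this might look is to label the parsed value as UTC with FROM_TZ and then shift it with AT TIME ZONE. The target zone below is an arbitrary choice for illustration, not something from the question.

```sql
-- Sketch only: 'America/New_York' is an assumed target zone.
-- FROM_TZ attaches the UTC zone to the plain TIMESTAMP;
-- AT TIME ZONE then converts it to the requested region.
SELECT FROM_TZ(
         TO_TIMESTAMP(
           XMLTYPE(t.xml).extract('//product/productionDate/text()').getStringVal(),
           'YYYY-MM-DD HH24:MI:SS.FF "UTC"'),
         'UTC') AT TIME ZONE 'America/New_York' AS production_ts_local
FROM   myTable t;
```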

Of course, given the date format you have, you could compare as strings and skip the conversion...
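String comparison works here because a zero-padded year-month-day layout sorts in the same order as the dates themselves. A rough sketch, assuming every stored value follows that layout without exception (the alias production_date_str is made up for this example):

```sql
-- Sketch only: relies on every value being 'YYYY-MM-DD ...', zero-padded.
-- The inline view avoids repeating the extraction expression.
SELECT *
FROM  (SELECT t.*,
              XMLTYPE(t.xml).extract('//product/productionDate/text()').getStringVal()
                AS production_date_str
       FROM   myTable t)
WHERE  production_date_str >= '1999-11-01'
AND    production_date_str <  '2000-01-02';  -- upper bound keeps all of 2000-01-01
```

The upper bound is written as "less than the next day" because a value such as '2000-01-01 00:00:00.0 UTC' sorts after the bare string '2000-01-01'.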

Poor-Man's Method for Optimizing Queries for File-Based Data Types

There are a number of ways to optimize storage and retrieval of file-based data types (CLOB/XML). If you're anchored to a relational database solution such as Oracle, here are some simple suggestions that use out-of-the-box functionality without having to buy, license or install additional add-ons and products.

Initial Thoughts: Thinking About Traffic and Usage

It may help to consider how often you need to dive into the XML files to extract their properties. It is also important to consider how often the contents of these XML files change once they have been added to your CLOB-based table.

If the DML traffic involving your CLOB typed table isn't too heavy, consider building an assisting object that has the kinds of hooks that will help you find the right XML files when you need them. Once you have it, you can divert query activity away from your source table, except for specific primary-key-based lookups.

Inline Function Conversions and Their Optimization

That being said, you could apply conversion functions on the fly and transform the date value every time you query it. There are even add-ons for Oracle that optimize the database for querying and searching for values within LOB objects.

There is even an optimization convention called a FUNCTION BASED INDEX, which prompts Oracle to use an alternate reference (index) built on the same TO_DATE/SUBSTR function combination you would need to convert the value in your XML document.

There are alternatives to INLINE conversions, and they are probably a lot better, because an INLINE conversion is applied every time the query is called. In general, queries that avoid conversion functions and keep data in its native format (i.e., a date as a date, datetime or timestamp) run much faster and cost fewer database resources.

Externally Tagging Your XML Documents

Assuming that the CLOB table with your XML files also has a primary key column, two ideas come to mind:

  1. Create a second table with the converted values (such as PRODUCTION_DATE), and identify each date record with the PK ID of the CLOB that it came from. This table can be extended as you discover new attributes that are frequently accessed by your queries. Manage the second table by placing a trigger on the CLOB table.

    Any time a CLOB record is added, the data value is extracted and converted exactly once. If anything queries a CLOB record that has not changed since the last time it was queried, there is no need to extract and convert the value again.

  2. Create a Materialized View containing the extraction and conversion function so that the MView column for PRODUCTION_DATE can be defined as a real DATE type. MViews work like Idea (1), but a little more elegantly.

    If possible, a FAST refreshable view would work well, as it maintains your MView automatically and responds in near real time to changes in your CLOB table. A FAST refreshable view only updates the MView with changes and additions, so even if your source table is huge, daily operations are based on incremental effects only.
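Idea (1) might be sketched roughly as follows. Everything named here is hypothetical: the tag table product_tags, the trigger trg_mytable_tags, and a numeric primary key column id on myTable. The sketch also assumes each document carries exactly one productionDate whose first ten characters are YYYY-MM-DD.

```sql
-- Sketch only: product_tags, trg_mytable_tags and myTable.id are assumed names.
CREATE TABLE product_tags (
  clob_id         NUMBER PRIMARY KEY,  -- PK value of the source row in myTable
  production_date DATE
);

CREATE OR REPLACE TRIGGER trg_mytable_tags
AFTER INSERT OR UPDATE OF xml ON myTable
FOR EACH ROW
DECLARE
  v_date DATE;
BEGIN
  -- Extract and convert once, at write time, keeping only the date part.
  SELECT TO_DATE(
           SUBSTR(XMLTYPE(:NEW.xml).extract('//product/productionDate/text()').getStringVal(), 1, 10),
           'YYYY-MM-DD')
  INTO   v_date
  FROM   dual;

  -- Insert the tag row, or refresh it if the document was updated.
  MERGE INTO product_tags pt
  USING (SELECT :NEW.id AS clob_id FROM dual) src
  ON    (pt.clob_id = src.clob_id)
  WHEN MATCHED THEN
    UPDATE SET pt.production_date = v_date
  WHEN NOT MATCHED THEN
    INSERT (clob_id, production_date) VALUES (src.clob_id, v_date);
END;
/
```

Deletes are not handled in this sketch; a foreign key with ON DELETE CASCADE or a matching delete trigger would be one way to keep the two tables aligned.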

Without knowing the volume, usage or statistics of your data, a COMPLETE refresh type MView may or may not be possible or efficient without additional compromises or assumptions. Sometimes MView query definitions are too complex to qualify for the FAST refreshable format.
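As a conservative illustration of Idea (2), the sketch below uses a COMPLETE, on-demand refresh, since it is not certain that a query like this would qualify for FAST refresh. The names product_tags_mv and product_tags_mv_dt, and the primary key column id, are assumptions.

```sql
-- Sketch only: product_tags_mv, product_tags_mv_dt and myTable.id are assumed names.
CREATE MATERIALIZED VIEW product_tags_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT t.id,
       TO_DATE(
         SUBSTR(XMLTYPE(t.xml).extract('//product/productionDate/text()').getStringVal(), 1, 10),
         'YYYY-MM-DD') AS production_date   -- a real DATE column in the MView
FROM   myTable t;

-- An ordinary index on the converted column supports range searches.
CREATE INDEX product_tags_mv_dt ON product_tags_mv (production_date);

-- Rebuild the MView contents when needed ('C' requests a complete refresh).
BEGIN
  DBMS_MVIEW.REFRESH('PRODUCT_TAGS_MV', 'C');
END;
/
```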

In either case, you end up with a second object (MVIEW or TABLE) that contains a DATE formatted value for PRODUCTION_DATE. Query this TAG table instead, and once you've narrowed down the set or individual record that meets the query criteria, use the associated PK value to identify which record in your CLOB table should be accessed.
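The lookup pattern described above could look something like this, assuming a hypothetical tag table product_tags(clob_id, production_date) and a primary key column id on myTable:

```sql
-- Sketch only: product_tags, clob_id and myTable.id are assumed names.
-- The date filter runs against the small tag table; the CLOB table is
-- touched only for the rows that qualify.
SELECT t.*
FROM   product_tags pt
JOIN   myTable t ON t.id = pt.clob_id
WHERE  pt.production_date BETWEEN DATE '1999-11-01' AND DATE '2000-01-01';
```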

Parting Thoughts (Optional... sort of)

If it's possible to have a NULL PRODUCTION_DATE value, it would be tempting to simply apply an NVL() inline function when querying the TAG supporting table. Depending on the volume of data in question, this probably shouldn't be a problem. Ideally, you'll want to always have a value in there... and a NOT NULL constraint on that DATE column... those kinds of things help the db optimizer make better assumptions about how to dive into the tables... especially if there are lots and lots of records in there.

All in all, you could also use the XMLTYPE data type, which is actually optimized for handling XML. It will outperform more of the alternative solutions than you might think. No poor man's solution is needed (and besides, in 8i/9.1/9.2 database versions this would be a bad "solution"). That said, XML DB is a no-cost option that can be used freely, with no extra license needed.
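As a rough idea of what the XMLTYPE route could look like, the sketch below stores the document in an XMLTYPE column and reads the tag with XMLTABLE. The table my_xml_table and its columns are invented for this example, and the date handling assumes the same string layout as in the question.

```sql
-- Sketch only: my_xml_table, id and doc are assumed names.
CREATE TABLE my_xml_table (
  id  NUMBER PRIMARY KEY,
  doc XMLTYPE
);

-- XMLTABLE exposes the tag as a relational column, which is then
-- trimmed to its date part and converted for the comparison.
SELECT t.id
FROM   my_xml_table t,
       XMLTABLE('//product' PASSING t.doc
                COLUMNS production_date_str VARCHAR2(40) PATH 'productionDate') x
WHERE  TO_DATE(SUBSTR(x.production_date_str, 1, 10), 'YYYY-MM-DD')
       BETWEEN DATE '1999-11-01' AND DATE '2000-01-01';
```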
