What to use to store serialized data that can be queried?

I need to extract data from an incoming message that could be in any format. The data to extract and store also depends on the format, i.e. format A could yield fields X, Y, Z, while format B could yield fields A, B, C. I also need to be able to find message B by searching for field C within the message.

Right now I'm configuring and storing the extraction strategy (XSLT) and executing it at runtime when its related format is encountered, and I'm storing the extracted data in an Oracle database as an XmlType column. Oracle seems to have pretty lax development/support for XmlType: it requires an old jar that forces you to use a pretty old DOM DocumentBuilderFactory implementation (looks like Java 1.4 code), which collides with Spring 3 and doesn't play very nicely with Hibernate. The XML queries are slow and non-intuitive as well.

I'm concluding that Oracle with XmlType isn't a very good way to store the extracted data, so my question is, what is the best way to store the serialized/queryable data?

One alternative that you haven't listed is using an XML database. (Notice that Oracle is one of the ten or so XML database products.)

(Obviously, a blob type won't allow querying "inside" the persisted XML objects unless you read each blob instance into memory and do the querying there, e.g. using XSLT.)
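As a rough illustration of that cost, here is a minimal Python sketch that treats each stored blob as an opaque string and has to parse every one of them to answer a single field lookup. The `blobs` dictionary and the `find_ids_by_field` helper are hypothetical stand-ins for a blob column and application-side query code, and ElementTree's limited path support is used here in place of XSLT.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for rows of a blob column: id -> serialized XML.
blobs = {
    1: "<test><name>Peter</name><info>Peter is a good cook</info></test>",
    2: "<test><name>Paul</name><info>Paul is a baker</info></test>",
    3: "<test><name>Peter</name><info>Another Peter</info></test>",
}


def find_ids_by_field(rows, path, value):
    """Return the ids of all documents whose element at `path` has text `value`.

    Every blob is parsed on every call, so the cost grows with the total
    amount of stored XML, not with the number of matches.
    """
    matches = []
    for row_id, payload in rows.items():
        root = ET.fromstring(payload)   # full parse of each document
        node = root.find(path)          # path is relative to the root element
        if node is not None and node.text == value:
            matches.append(row_id)
    return matches


print(find_ids_by_field(blobs, "name", "Peter"))
```

With only three documents this is harmless, but the same loop over a large table amounts to a full scan plus a parse per row.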

I have had great success storing complex XML objects in PostgreSQL. Together with its functional index feature, you can even create indexes on node values of the stored XML documents, and use those indexes to do very fast lookups via index scans without having to reparse the XML.

However, this only works if you know your query patterns in advance; arbitrary XPath queries will still be slow.

Example (untested, so treat the exact syntax as approximate):

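Create a simple table (the column must be of type xml for xpath() to accept it):

create table test123 (
    id serial primary key,
    myxml xml
);

Now let's assume you have XML documents like:

<test>
    <name>Peter</name>
    <info>Peter is a <i>very</i> good cook</info>
</test>

Now create a functional (expression) index on the name node. xpath() takes the path first and the document second, and returns an array, so take the first element and cast it to text:

create index idx_test123_name on test123 (((xpath('/test/name/text()', myxml))[1]::text));

Now do your fast XML lookups, repeating the indexed expression exactly so the planner can match it to the index:

SELECT myxml FROM test123 WHERE (xpath('/test/name/text()', myxml))[1]::text = 'Peter';

You should also consider creating an index using text_pattern_ops, so you can have fast prefix lookups like:

SELECT myxml FROM test123 WHERE (xpath('/test/name/text()', myxml))[1]::text like 'Pe%';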
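To make the idea behind these indexes a little more concrete, the following Python sketch mimics them in memory: the name is extracted once, when a document is stored, and lookups then consult only the sorted extracted keys. This is only an analogy and not how PostgreSQL implements its indexes; `NameIndex` and its methods are hypothetical names chosen for the illustration.

```python
import bisect
import xml.etree.ElementTree as ET


class NameIndex:
    """Toy in-memory analogue of an index on /test/name (illustrative only)."""

    def __init__(self):
        self._entries = []  # sorted list of (name, doc_id) pairs

    def add(self, doc_id, payload):
        # The XML is parsed once, at write time, to extract the indexed value.
        name = ET.fromstring(payload).findtext("name", default="")
        bisect.insort(self._entries, (name, doc_id))

    def equals(self, name):
        # Equality lookup: binary search to the first entry for `name`.
        start = bisect.bisect_left(self._entries, (name,))
        ids = []
        for key, doc_id in self._entries[start:]:
            if key != name:
                break
            ids.append(doc_id)
        return ids

    def starts_with(self, prefix):
        # Prefix lookup (like 'Pe%'): a range scan over the sorted keys.
        start = bisect.bisect_left(self._entries, (prefix,))
        ids = []
        for key, doc_id in self._entries[start:]:
            if not key.startswith(prefix):
                break
            ids.append(doc_id)
        return ids


index = NameIndex()
index.add(1, "<test><name>Peter</name></test>")
index.add(2, "<test><name>Paul</name></test>")
index.add(3, "<test><name>Penny</name></test>")

print(index.equals("Peter"))
print(index.starts_with("Pe"))
```

Neither lookup touches the XML again, which is the benefit of indexing a known path. A query on any other node, such as the info text, gets no help from this structure and falls back to parsing every document.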
