exist-db update insert very slow

I am a beginner with exist-db. I am building an XML document through Java: I process the data with JAXB and then insert it into an exist-db resource using update insert. I am currently testing with around 500 nodes, and after a few dozen inserts have executed, each insert starts taking up to 10 seconds. My XML has the following general structure.

<realestatedata>
<agents>
    <author id="1">
        <name>Author_A</name>
    </author>
    <author id="2">
        <name>Author_B</name>
    </author>
    <portal id="1">
        <name>Portal_A</name>
    </portal>
</agents>
<artifacts>
    <document id="1">            
        <latitude>51.37392</latitude>
        <longitude>-0.00866</longitude>
        <bathroom_number>1</bathroom_number>
        <bedroom_number>3</bedroom_number>
        <price>365000</price>
    </document>
    <theme id="1">
        <name>Garden</name>
    </theme>
    <place id="1">
        <name>BR4</name>
        <location>
            <lat>51.37392</lat>
            <lon>-0.00866</lon>
        </location>
    </place>
</artifacts>
</realestatedata>

To ensure elements are placed in the correct order, I use the following code for the update insert, so that a new record is either the first of its type or is appended after the last existing element of the same type, which is located by its id.

public void saveAuthor(Author author) {
    XQueryService xQueryService = null;
    CompiledExpression compiled = null;
    int currentId = authorIdSequence.get();
    StringWriter authorXml = new StringWriter();
    try {
        xQueryService = Utils.getXQueryService();
        if (getAuthorByName(author.getName()) == null) {
            author.setId(String.valueOf(authorIdSequence.incrementAndGet()));
            marshaller.marshal(author, authorXml);
            if(currentId == 0){
                compiled = xQueryService
                        .compile("update insert " + authorXml.toString()
                                + " into //agents");
            }
            else{
                compiled = xQueryService
                        .compile("update insert " + authorXml.toString()
                                + " following //author[@id = '"+String.valueOf(currentId)+"']");
            }               
            xQueryService.execute(compiled);
        }

    } catch (XMLDBException e) {
        e.printStackTrace();
    } catch (JAXBException e) {
        e.printStackTrace();
    }
}

The same method is executed for other elements such as document, place, etc. After a few updates it gets very slow: inserting a single record starts taking up to ten seconds.

The only related links I could find are unanswered.

http://sourceforge.net/mailarchive/forum.php?thread_name=s2s508bb1471004190430h8b42ee99o3f1835a9bc873d58%40mail.gmail.com&forum_name=exist-development

http://exist.2174344.n4.nabble.com/Slow-xquery-quot-update-insert-quot-performance-tt4657541.html#none

A few thoughts:

  • Attribute filters ( [@id=…] ) can be pretty slow when run on a large set of nodes. Consider that your code as posted requires eXist to check the @id of every previously inserted author before it finds the right place to insert the new one. I can think of a few ways to solve this:
    1. A range index on the @id attributes would speed things up considerably.
    2. Using @xml:id instead of @id would let you use id(…), which would be faster still. This would require changing your ids to be unique across element types, though (e.g. "author_1" and "portal_1").
    3. If you really are always incrementing your @id values, new nodes will always have the largest @id. In that case, following //author[last()] or even into //agents will work just fine.
  • Doing many small inserts will always be slower than doing one big insert. If possible, delay saving new data to eXist until you have a batch to insert at once.
  • Make sure the XQueryService instances you're creating are released properly once you're done with them. Is Utils.getXQueryService() possibly keeping references it shouldn't?
  • Make sure you're not compounding overhead unnecessarily. Can you reuse an XQueryService between calls? If getAuthorByName() is querying eXist, can it be combined with the update query? Can you provide the node(s) to insert through a variable binding instead of as literals in the query, so that you can reuse the same compiled query every time?
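
Point 3 in the first bullet can be sketched in Java roughly as follows. This is only an illustration, not the code from the question: the class name AuthorInsertQuery and its build method are hypothetical, and it assumes authorXml is a single well-formed element without an XML declaration and that ids only ever increase.

```java
/**
 * Hypothetical helper: builds an eXist "update insert" statement that
 * places a new author without an [@id = ...] lookup.
 */
final class AuthorInsertQuery {

    private AuthorInsertQuery() {
    }

    /**
     * @param authorXml   serialized author element (assumed: no XML declaration)
     * @param firstAuthor true when no author element has been inserted yet
     * @return the XQuery update statement as a string
     */
    static String build(String authorXml, boolean firstAuthor) {
        // Assumption: ids are strictly increasing, so the newest author always
        // belongs directly after the last existing author.
        String target = firstAuthor
                ? "into //agents"
                : "following //author[last()]";
        return "update insert " + authorXml + " " + target;
    }
}
```

The resulting string would be passed to xQueryService.compile(...) exactly as in the question; only the target expression changes, so eXist no longer has to compare @id values to find the insertion point.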

All that being said, 10 seconds is an awfully long time for a single insert if you only have 500 nodes. In a quick test on my machine, running a batch of updates in a single query with the un-indexed "following" syntax inserts all 500 nodes in about half that time. Quite likely something larger is going wrong that isn't evident from your question.
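
One possible way to batch inserts from Java is to build a single statement from several serialized fragments. This is a minimal sketch under assumptions: it relies on eXist's update insert accepting a sequence of nodes as its content expression, the class name BatchInsertQuery is hypothetical, and it is not necessarily the form of query used in the quick test mentioned above.

```java
import java.util.List;

/**
 * Hypothetical helper: combines several serialized elements into one
 * eXist "update insert" statement so they are inserted in a single query.
 */
final class BatchInsertQuery {

    private BatchInsertQuery() {
    }

    /**
     * @param fragments serialized elements (assumed: well-formed, no XML declaration)
     * @param target    insertion target, e.g. "into //agents"
     * @return one update statement that inserts all fragments in order
     */
    static String build(List<String> fragments, String target) {
        if (fragments.isEmpty()) {
            throw new IllegalArgumentException("nothing to insert");
        }
        // A parenthesized, comma-separated list of element constructors
        // is an XQuery sequence expression.
        return "update insert (" + String.join(", ", fragments) + ") " + target;
    }
}
```

Collecting the marshalled elements in a list and sending them once per batch would replace hundreds of compile/execute round trips with a handful.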
