exist-db update insert very slow
I am a beginner with exist-db. I am building an XML document through Java: I process data through JAXB and then insert it into an exist-db resource using update insert. I am testing with around 500 nodes at this time, and after a few dozen inserts have executed, each one starts taking up to 10 seconds. My XML has the following general structure.
<realestatedata>
<agents>
<author id="1">
<name>Author_A</name>
</author>
<author id="2">
<name>Author_B</name>
</author>
<portal id="1">
<name>Portal_A</name>
</portal>
</agents>
<artifacts>
<document id="1">
<latitude>51.37392</latitude>
<longitude>-0.00866</longitude>
<bathroom_number>1</bathroom_number>
<bedroom_number>3</bedroom_number>
<price>365000</price>
</document>
<theme id="1">
<name>Garden</name>
</theme>
<place id="1">
<name>BR4</name>
<location>
<lat>51.37392</lat>
<lon>-0.00866</lon>
</location>
</place>
</artifacts>
</realestatedata>
To ensure elements are placed in the correct order, I am using the following code for the update insert, so that a new record of a given type is either the first one or is appended after the last similar element, based on ids.
public void saveAuthor(Author author) {
XQueryService xQueryService = null;
CompiledExpression compiled = null;
int currentId = authorIdSequence.get();
StringWriter authorXml = new StringWriter();
try {
xQueryService = Utils.getXQeuryService();
if (getAuthorByName(author.getName()) == null) {
author.setId(String.valueOf(authorIdSequence.incrementAndGet()));
marshaller.marshal(author, authorXml);
if(currentId == 0){
compiled = xQueryService
.compile("update insert " + authorXml.toString()
+ " into //agents");
}
else{
compiled = xQueryService
.compile("update insert " + authorXml.toString()
+ " following //author[@id = '"+String.valueOf(currentId)+"']");
}
xQueryService.execute(compiled);
}
} catch (XMLDBException e) {
e.printStackTrace();
} catch (JAXBException e) {
e.printStackTrace();
}
}
The same methods are executed for other elements like document, place etc. After a few updates, it gets very slow: it starts taking up to ten seconds to insert one record.
The only related links I could find are unanswered.
http://sourceforge.net/mailarchive/forum.php?thread_name=s2s508bb1471004190430h8b42ee99o3f1835a9bc873d58%40mail.gmail.com&forum_name=exist-development
http://exist.2174344.n4.nabble.com/Slow-xquery-quot-update-insert-quot-performance-tt4657541.html#none
A few thoughts:

- Predicate lookups (i.e. [@id=…]) can be pretty slow when run on a large set of nodes. Consider that your code as posted will require eXist to check the @id of every previously inserted author before finding the right place to insert the new one. I can think of a few ways to solve this:
  - A range index on the @id's would speed things up considerably.
  - Using @xml:id instead of @id would let you use id(…), which would be even faster yet. This would require changing your ids to be unique though (e.g. "author_1" and "portal_1").
  - If you are always incrementing the @id values, new nodes will always have the largest @id. In that case, following //author[last()] or even into //agents will work just fine.
- Make sure the XQueryServices you're creating are getting released properly after you're done with them. Is Utils.getXQueryService() possibly keeping references it shouldn't?
- Can you reuse XQueryServices between calls?
- If getAuthorByName() is querying eXist, can it be combined with the update query?
- Can you provide the node(s) to insert through a variable binding instead of as literals in the query, so that you can reuse the same compiled query every time?

All that being said though, 10s is an awfully long time for a single insert if you only have 500 nodes. A quick test on my machine, using the un-indexed "following" syntax to run a batch of updates in a single query, can do the whole 500 in half that time. There's quite likely something larger going wrong that's not evident in your question.
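To make a few of the suggestions above concrete, here is a rough sketch of what a single, constant query could look like. It drops the [@id=…] predicate in favour of author[last()], folds the name check into the update, and takes the new node through an external variable so the query text never changes. The class name AuthorInsertQuery, the variable name authorXml, and the use of parse-xml() to turn the marshalled string into a node are my assumptions, not something from your code; I have not run this against your data, and it assumes the JAXB root element is author and that there is exactly one agents element.

```java
/**
 * Sketch only: holds one constant XQuery for inserting an author.
 * Nothing in this class talks to eXist; it just builds the query text.
 */
public class AuthorInsertQuery {

    // Hypothetical external variable name. The caller would bind the
    // marshalled author XML (as a string) to it before each execution.
    public static final String VARIABLE = "authorXml";

    // Constant query text, so one compiled expression could be reused.
    // Assumption: fn:parse-xml is available in the eXist version in use.
    public static final String QUERY = String.join("\n",
        "declare variable $" + VARIABLE + " external;",
        "let $author := parse-xml($" + VARIABLE + ")/author",
        "let $agents := //agents",
        "return",
        // Skip the insert when an author with the same name already exists.
        "  if (exists($agents/author[name = $author/name])) then ()",
        // Ids only ever increase, so the newest author goes after the last one.
        "  else if (exists($agents/author)) then",
        "    update insert $author following $agents/author[last()]",
        // No authors yet: same behaviour as the original currentId == 0 branch.
        "  else",
        "    update insert $author into $agents");

    public static void main(String[] args) {
        // Print the query so it can be pasted into a query console for a trial run.
        System.out.println(QUERY);
    }
}
```

With this, saveAuthor() would compile QUERY once, bind the marshalled XML to authorXml with the service's declareVariable method before each execute, and no longer need the separate getAuthorByName() round trip. Whether a compiled expression picks up a re-bound variable on every execute is worth confirming on your eXist version.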