
Solr only vs. Solr/MySQL solution

Currently I have a system that is based solely on Solr: I store all data in Solr (using SolrJ), with no other datastore involved. The problem is that I am now running into performance issues. I thought it might make sense to store the data in MySQL and then synchronize it with Solr, e.g. with the DataImportHandler. That way the read operations would hit the Solr index, the main write operations would go to MySQL, and Solr would only see writes when it is synchronized.

The thing is that I expect hundreds of millions of documents to be stored, and I don't really know whether the MySQL/Solr combination makes sense.

Is there another, better solution? Maybe a master Solr for writing and Solr slaves for reading?

Update: What I forgot to say is that the "store the data in MySQL" solution could also be useful when schema.xml changes, because then I can re-index all the data from MySQL without having to care about the data Solr has stored itself.

It's not preferable to use the same Solr instance for both reading and writing, because the write-side activity on Solr (commits and optimizes) would heavily impact the read operations.

A master-slave configuration would be a nicer approach, with the master primarily for writes and the slaves for read-only purposes.
The slaves are periodically refreshed with the contents of the master, so there would be some delay before changes become visible to readers.
You can always scale reads by adding more slaves.
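To make the read/write split concrete, here is a minimal sketch of how a client could route requests: updates always go to the master, queries rotate over the slaves. The class name, the host names and the core name `docs` are made up for illustration; only the `/update` and `/select` handler paths are standard Solr.

```python
from itertools import cycle


class SolrRouter:
    """Send writes to one master and spread reads over the slaves."""

    def __init__(self, master_url, slave_urls):
        if not slave_urls:
            raise ValueError("at least one slave URL is required")
        self.master_url = master_url
        self._slaves = cycle(slave_urls)  # endless round-robin iterator

    def url_for_update(self):
        # All indexing traffic goes to the master only.
        return self.master_url + "/update"

    def url_for_query(self):
        # Each call hands out the next slave in turn.
        return next(self._slaves) + "/select"


# Hypothetical hosts and core name, for illustration only.
router = SolrRouter(
    "http://solr-master:8983/solr/docs",
    [
        "http://solr-slave1:8983/solr/docs",
        "http://solr-slave2:8983/solr/docs",
    ],
)
```

Adding another slave then only means adding one more URL to the list; the writers never need to know about it.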

Using MySQL as a persistent store together with master-slave Solr would be the best approach.
MySQL provides a stable data store and would guard you against index corruption or other issues that could result in data loss.
Using the DataImportHandler you can do this easily with incremental updates, but there would be more time lag before the latest data appears on the slaves.
With this setup you can also use index swapping for full refreshes.
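The incremental-update idea can be sketched without the DataImportHandler itself: pick up the rows changed since the last sync and turn them into an update request for Solr. This is only an illustration under assumptions: an in-memory SQLite database stands in for MySQL so the snippet runs on its own, and the table `documents`, its columns and the function names are hypothetical. The payload is a plain JSON array of documents, which is one of the shapes Solr's JSON update handler accepts; actually sending it is left out.

```python
import json
import sqlite3


def fetch_changed_rows(conn, last_sync):
    """Return rows modified after the previous sync, oldest first."""
    cur = conn.execute(
        "SELECT id, title, body, last_modified FROM documents "
        "WHERE last_modified > ? ORDER BY last_modified",
        (last_sync,),
    )
    return cur.fetchall()


def build_update_payload(rows):
    """Serialize the rows as a JSON array of Solr documents."""
    docs = [{"id": str(r[0]), "title": r[1], "body": r[2]} for r in rows]
    return json.dumps(docs)


def next_sync_marker(rows, last_sync):
    """Advance the marker to the newest row seen, or keep the old one."""
    return rows[-1][3] if rows else last_sync


# SQLite stands in for MySQL here; schema and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, title TEXT, body TEXT, last_modified TEXT)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        (1, "old", "indexed earlier", "2024-01-01 10:00:00"),
        (2, "new", "changed since last sync", "2024-01-02 09:30:00"),
    ],
)

last_sync = "2024-01-01 12:00:00"
rows = fetch_changed_rows(conn, last_sync)
payload = build_update_payload(rows)       # body for a POST to the master's /update
last_sync = next_sync_marker(rows, last_sync)
```

Only document 2 ends up in the payload, so the master sees just the changed data. A full refresh would be the same query without the `WHERE` clause.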

In case the index grows too huge to be maintainable and performance suffers, you may want to look at Solr shards.
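As a rough sketch of what sharding means for the client: each document is assigned to one shard when it is indexed, and a query lists all shards so that Solr merges the partial results. The shard addresses and function names below are hypothetical, and the CRC32-modulo routing is just one simple scheme I picked for the example. The `shards` request parameter (a comma-separated list of `host:port/path` entries without the `http://` prefix) is how Solr's manual distributed search is addressed.

```python
import zlib
from urllib.parse import urlencode

# Hypothetical shard addresses, written the way the `shards` parameter expects.
SHARDS = [
    "solr1:8983/solr/docs",
    "solr2:8983/solr/docs",
    "solr3:8983/solr/docs",
]


def shard_for(doc_id, shards=SHARDS):
    """Pick the shard a document is indexed on, using a stable hash of its id."""
    index = zlib.crc32(doc_id.encode("utf-8")) % len(shards)
    return shards[index]


def distributed_query_url(entry_shard, query, shards=SHARDS):
    """Build a query URL that asks one node to search across all shards."""
    params = urlencode({"q": query, "shards": ",".join(shards)}, safe=":/,")
    return "http://" + entry_shard + "/select?" + params


url = distributed_query_url(SHARDS[0], "title:solr")
```

Note that with plain modulo routing, changing the number of shards moves most documents to a different shard, so a re-index from the persistent store would be needed.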

I also thought about the same issue: store everything in Solr, or store in MySQL and index in Solr.

I decided to go the second way: store in MySQL and index in Solr.

The reason: handling data (reading and writing) is much better in MySQL than in Solr. Also, data import/export from/to MySQL is supported by lots of tools out of the box. Next point: backup. There are many more established ways of backing up a MySQL DB than a Solr index.

Of course, for full-text search, Solr is much better than MySQL. So I decided that each tool should do the work it knows best. For your information: I'm talking about a medium-sized index, 4 GB for some million documents.

//Edit: don't forget that some features, like highlighting, require the data to be stored in Lucene (not only indexed). If you need this, you have to store the documents in Solr as well. An alternative is to implement those features on the client side. (I did it this way.)
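For the client-side route, highlighting can be approximated on the text fetched from MySQL. The sketch below is not how Solr's highlighter works (there is no analysis or stemming, only whole-word, case-insensitive matching), and the function name and the `<em>` tags are my own choices for the example.

```python
import html
import re


def highlight(text, terms, pre="<em>", post="</em>"):
    """Wrap whole-word, case-insensitive matches of the terms in pre/post tags."""
    terms = [t for t in terms if t]
    if not terms:
        return html.escape(text)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    # With one capturing group, split() alternates plain text and matches,
    # so the pieces at odd positions are the matched terms.
    parts = pattern.split(text)
    out = []
    for i, part in enumerate(parts):
        piece = html.escape(part)  # escape the raw text before adding markup
        out.append(pre + piece + post if i % 2 else piece)
    return "".join(out)


snippet = highlight("Solr & MySQL: solr wins at search", ["solr", "search"])
# snippet == "<em>Solr</em> &amp; MySQL: <em>solr</em> wins at <em>search</em>"
```

Because the raw text is escaped piece by piece, markup in the stored document cannot leak into the page, and the highlight tags are the only HTML that gets added.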
