
How to use Elastic Search on top of a pre-existing SQL Database?

I've been reading through a lot of good documentation about how to implement Elastic Search on a website with JavaScript or PHP.

Very good introduction to ES.

Very complete documentation here and here.

A whole CRUD.

Elastic Search with PHP: here, here, and here.

So the reason I'm giving you those URLs is that I want to understand how to use one or more of those great documents when there is a pre-existing SQL DB.

I'm missing the point somewhere: as they said, Elasticsearch will create its own indexes and DB with MongoDB, so I don't understand how I can use my (gigantic) SQL database with it. Let's say I have a MySQL DB, and I would like to use Elasticsearch to make my searches faster and to offer the user pre-made queries. How do I do that? How does ES work over/alongside MySQL? How do I transfer this gigantic data set (over 8 GB) into the ES DB so that it is fully efficient from the start?

Many thanks.

I am using jdbc-river with MySQL. It is very fast. You can configure it to continually poll data, or use one-time (one-shot strategy) imports.

e.g.

curl -XPUT http://es-server:9200/_river/my_river/_meta -d '
{
    "type" : "jdbc",
    "jdbc" : {
        "strategy" : "simple",
        "poll" : "5s",
        "scale" : 0,
        "autocommit" : false,
        "fetchsize" : 10,
        "max_rows" : 0,
        "max_retries" : 3,
        "max_retries_wait" : "10s",
        "driver" : "com.mysql.jdbc.Driver",
        "url" : "jdbc:mysql://mysql-server:3306/mydb",
        "user" : "root",
        "password" : "password*",
        "sql" : "select c.id, c.brandCode, c.companyCode from category c"
    },
    "index" : {
        "index" : "main_index",
        "type" : "category",
        "bulk_size" : 30,
        "max_bulk_requests" : 100,
        "index_settings" : null,
        "type_mapping" : null,
        "versioning" : false,
        "acknowledge" : false
    }
}'
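
For a one-time import without the river plugin, the same idea can be sketched by hand: read the rows in batches and turn each batch into a body for Elasticsearch's _bulk endpoint (one action line plus one source line per document, newline-delimited). This is only a minimal sketch under assumptions: sqlite3 stands in for MySQL so the example is self-contained, and the bulk_bodies helper, the sample rows, and the index name main_index are made up for illustration. Older Elasticsearch versions also expected a "_type" in the action line, which is omitted here.

```python
import json
import sqlite3


def bulk_bodies(cursor, index, id_column="id", batch_size=500):
    """Yield newline-delimited JSON bodies for the _bulk endpoint (hypothetical helper)."""
    columns = [col[0] for col in cursor.description]
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        lines = []
        for row in rows:
            doc = dict(zip(columns, row))
            # Action line: target index and document id for the line that follows.
            lines.append(json.dumps({"index": {"_index": index, "_id": str(doc[id_column])}}))
            # Source line: the document itself.
            lines.append(json.dumps(doc))
        # A bulk body must end with a newline.
        yield "\n".join(lines) + "\n"


# Stand-in for the MySQL "category" table (assumption: sqlite3 instead of MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category (id INTEGER PRIMARY KEY, brandCode TEXT, companyCode TEXT)"
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [(1, "B1", "C1"), (2, "B2", "C1"), (3, "B3", "C2")],
)

cur = conn.execute("SELECT id, brandCode, companyCode FROM category")
bodies = list(bulk_bodies(cur, index="main_index", batch_size=2))
for body in bodies:
    print(body, end="")
```

Each yielded body would then be sent as one POST to http://es-server:9200/_bulk with the Content-Type application/x-ndjson. Batching keeps memory use flat, which matters for a table of several GB.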

If you need a more performant and scalable alternative to the polling offered by jdbc-river, I recommend that you watch this presentation, which explains how to perform incremental syncing from SQL Server into Elastic Search.

The principles discussed in the video also apply to other RDBMS -> NoSQL replication applications.
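
One common way to do incremental syncing (not necessarily the approach shown in the presentation) is a high-water mark: every row carries a version number that increases on each change, and each sync run only fetches rows above the last version it saw. A rough sketch, assuming a hypothetical version column on the category table and using sqlite3 in place of the real database so it runs on its own:

```python
import sqlite3


def fetch_changes(conn, last_version):
    """Return rows changed after last_version, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, brandCode, companyCode, version FROM category "
        "WHERE version > ? ORDER BY version",
        (last_version,),
    ).fetchall()
    # Keep the old mark when nothing has changed.
    new_mark = rows[-1][3] if rows else last_version
    return rows, new_mark


# Stand-in table; the "version" column is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category ("
    "id INTEGER PRIMARY KEY, brandCode TEXT, companyCode TEXT, version INTEGER)"
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?, ?)",
    [(1, "B1", "C1", 1), (2, "B2", "C1", 2)],
)

# First run: mark 0 means "everything", so this is the initial full load.
first_batch, mark = fetch_changes(conn, 0)

# A row changes in the source database and gets a higher version.
conn.execute("UPDATE category SET brandCode = 'B2x', version = 3 WHERE id = 2")

# Second run: only the changed row comes back.
second_batch, mark = fetch_changes(conn, mark)
print(len(first_batch), len(second_batch), mark)
```

Only the rows in each batch need to be re-indexed in Elastic Search, so the cost of a sync run depends on how much changed rather than on the size of the table. Note that this sketch does not detect deleted rows; those need separate handling, for example a soft-delete flag.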
