Single index multi type - elasticsearch indexing via tire

In my multi-tenant app (account-based, with a number of users per account), how would I update the index for a particular account when a user document is changed?

I have a separate index for each account, in which the mappings for each model are specified (user and comments are just an example; the actual app has many models). In this case, if any change is made to the user model or the comment model, the index that was created for the related account has to be updated. Is this possible? Please let me know if it is.

I guess this is the way to specify the mappings in my case. Correct me if I'm wrong.

Account Model:

include Tire::Model::Search

Tire.index('account_1') do
  create(
    :mappings => {
      :user => {
        :properties => {
          :name => { :type => :string, :boost => 10 },
          :company_name => { :type => :string, :boost => 5 }
        }
      },
      :comments => {
        :properties => {
          :description => { :type => :string, :boost => 5 }
        }
      }
    }
  )
end

The index is created correctly with both mappings for the account index. But I don't see a way to update the index when any model specified in the mappings is changed.

Whenever a new user is added or a user is updated, the index created for the corresponding account has to be updated.

This question is cross-posted from the GitHub issue "Multiple model single index approach". Cross-posting the answer here.


Let's say we have an Account class and we deal in article entities.

In that case, our Account class would have the following:

class Account
  #...

  # Set index name based on account ID
  #
  def articles
    Article.index_name "articles-#{self.id}"
    Article
  end
end

So, whenever we need to access articles for a particular account, either for searching or for indexing, we can simply do:

@account = Account.find( remember_token_or_something_like_that )

# Instead of `Article.search(...)`:
@account.articles.search { query { string 'something interesting' } }

# Instead of `Article.create(...)`:
@account.articles.create id: 'abc123', title: 'Another interesting article!', ...

Having a separate index per user/account works perfectly in certain cases -- but definitely not well in cases where you'd have tens or hundreds of thousands of indices (or more). Having index aliases, with properly set up filters and routing, would perform much better in this case. We would slice the data not based on the tenant identity, but based on time.
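The "filters and routing" part is not spelled out in this answer. As a rough sketch only: Elasticsearch's `POST /_aliases` endpoint accepts `add` actions that carry a `filter` and a `routing` value, so a per-account alias could sit on top of a shared index. The `account_id` field name, the `articles_account_<id>` alias naming, and the helper name below are assumptions made for illustration, not something taken from the original setup.

```ruby
require 'json'

# Build the request body for `POST /_aliases` that adds a filtered, routed
# alias for a single account on top of a shared index.
# ASSUMPTIONS: documents carry an `account_id` field, and the alias is
# named `articles_account_<id>`; neither comes from the original answer.
def account_alias_actions(index, account_id)
  {
    actions: [
      {
        add: {
          index:   index,
          alias:   "articles_account_#{account_id}",
          filter:  { term: { account_id: account_id } },
          routing: account_id.to_s
        }
      }
    ]
  }
end

puts JSON.pretty_generate(account_alias_actions('articles_2012-07-09', 42))
```

Searching through such an alias would only see documents matching the filter, and the routing value keeps one account's documents on the same shard.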

Let's have a look at a second scenario, starting with a heavily simplified curl http://localhost:9200/_aliases?pretty output:

{
  "articles_2012-07-02" : {
    "aliases" : {
      "articles_plan_pro" : {
      }
    }
  },
  "articles_2012-07-09" : {
    "aliases" : {
      "articles_current" : {
      },
      "articles_shared" : {
      },
      "articles_plan_basic" : {
      },
      "articles_plan_pro" : {
      }
    }
  },
  "articles_2012-07-16" : {
    "aliases" : {
    }
  }
}

You can see that we have three indices, one per week. You can see there are two similar aliases: articles_plan_pro and articles_plan_basic -- obviously, accounts with the “pro” subscription can search two weeks back, but accounts with the “basic” subscription can search only this week.

Notice also that the articles_current alias points to, ehm, the current week (I'm writing this on Thu 2012-07-12). The index for next week is just there, lying in wait -- when the time comes, a background job (cron, Resque worker, custom script, ...) will update the aliases. There's a nifty example with aliases in a “sliding window” scenario in the Tire integration test suite.
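To make the background job more concrete, here is a minimal sketch of the `POST /_aliases` body such a job might send when a new week starts. It only builds the payload and does not talk to Elasticsearch. The helper names are hypothetical, the Monday-based week is inferred from the index names in the output above, and `articles_shared` is left out because its rules are not described here.

```ruby
require 'date'
require 'json'

# Name of the weekly index containing `date` (weeks start on Monday),
# following the `articles_YYYY-MM-DD` pattern seen in the alias output.
def weekly_index(date)
  monday = date - ((date.wday + 6) % 7)
  "articles_#{monday.strftime('%Y-%m-%d')}"
end

# Body for `POST /_aliases` that slides the window forward to the week of
# `today`. ASSUMPTION: "basic" sees one week, "pro" sees two, as described
# above, and the aliases being removed currently exist on those indices.
def rotation_actions(today)
  current  = weekly_index(today)
  previous = weekly_index(today - 7)
  oldest   = weekly_index(today - 14)
  {
    actions: [
      { remove: { index: previous, alias: 'articles_current' } },
      { add:    { index: current,  alias: 'articles_current' } },
      { remove: { index: previous, alias: 'articles_plan_basic' } },
      { add:    { index: current,  alias: 'articles_plan_basic' } },
      { remove: { index: oldest,   alias: 'articles_plan_pro' } },
      { add:    { index: current,  alias: 'articles_plan_pro' } }
    ]
  }
end

puts JSON.pretty_generate(rotation_actions(Date.new(2012, 7, 16)))
```

Sending all actions in a single request lets Elasticsearch apply the alias changes together, so searches and indexing do not hit a moment when articles_current points nowhere.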

Let's not look at the articles_shared alias right now; let's look at what tricks we can play with this setup:

class Account
  # ...

  # Set index name based on account subscription
  #
  def articles
    if plan_code = self.subscription && self.subscription.plan_code
      Article.index_name "articles_plan_#{plan_code}"
    else
      Article.index_name "articles_shared"
    end
    return Article
  end
end

Again, we're setting up an index_name for the Article class, which holds our documents. When the current account has a valid subscription, we get the plan_code out of the subscription, and direct searches for this account into the relevant index: “basic” or “pro”.

If the account has no subscription -- it's probably a “visitor” type -- we direct the searches to the articles_shared alias. Using the interface is as simple as before, e.g. in ArticlesController:

@account  = Account.find( remember_token_or_something_like_that )
@articles = @account.articles.search { query { ... } }
# ...

We are not using the Article class as a gateway for indexing in this case; we have a separate indexing component, a Sinatra application serving as a light proxy to the elasticsearch Bulk API. It provides HTTP authentication and document validation (enforcing rules such as required properties or dates passed as UTC), and uses the bare Tire::Index#import and Tire::Index#store APIs.
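The proxy's validation code is not shown, so the following is only a guess at what "required properties" and "dates passed as UTC" checks could look like in plain Ruby. The property list, the `published_on` field, and the `validation_errors` helper are all hypothetical.

```ruby
require 'time'

# HYPOTHETICAL list of properties every incoming document must carry.
REQUIRED_PROPERTIES = %w[id title published_on].freeze

# Return a list of human-readable problems; an empty list means the
# document may be passed on to the indexing call.
def validation_errors(document)
  errors = REQUIRED_PROPERTIES.reject { |key| document.key?(key) }
                              .map { |key| "missing property: #{key}" }

  if document.key?('published_on')
    begin
      # Time.iso8601 raises ArgumentError on anything that is not ISO 8601.
      time = Time.iso8601(document['published_on'].to_s)
      errors << 'published_on must be UTC' unless time.utc_offset.zero?
    rescue ArgumentError
      errors << 'published_on is not ISO 8601'
    end
  end

  errors
end

doc = { 'id' => 'abc123', 'title' => 'Another interesting article!',
        'published_on' => '2012-07-12T12:00:00+02:00' }
puts validation_errors(doc).inspect
```

A proxy along these lines would reject the request when the list is non-empty and only then hand valid documents to Tire::Index#import or Tire::Index#store.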

These APIs talk to the articles_current index alias, which is periodically updated to the current week by said background process. In this way, we have decoupled all the logic for setting up index names into separate components of the application, so we don't need access to the Article or Account classes in the indexing proxy (it runs on a separate server), or in any other component of the application. Whichever component is indexing indexes against the articles_current alias; whichever component is searching searches against whatever alias or index makes sense for the particular component.

You probably want to use another gem like rubberband (https://github.com/grantr/rubberband) to set up the index the way you want it beforehand; maybe do it on account creation, in the after_create callback.

Then, in mapping your User and Comment models, you can use Tire to do something like this:

tire.mapping :_routing => { :required => true, :path => :account_id } do
  index_name 'account_name_here'
  ...
  ...
end

The tricky part will be getting the account_id or name into that index_name string/argument. It might be easy or difficult; I haven't tried dynamically assigning index_name yet.

Hope this helps!

