
Optimize Searching Through Rails Database

I'm building a Rails project, and I have a database with a set of tables, each holding between 500k and 1M rows, and I am constantly creating new rows.

By the nature of the project, before each creation I have to search the table for duplicates (on one field), so that I don't create the same row twice. Unfortunately, as my tables grow, this is taking longer and longer.

I was thinking that I could optimize the search by adding indexes to the specific String fields I am searching on, but I have heard that adding indexes increases creation time.

So my question is as follows: what is the trade-off between finding and creating rows when the fields involved are indexed? I know that adding indexes to the fields will make Model.find_by_name faster, but how much slower will it make my row creation?

Indexing slows down the insertion of entries, because each new entry also has to be added to the index, and that takes some resources. Once the entries are in the index, it speeds up your SELECT queries, as you said. BUT a B-tree may not be the right choice for every query! A B-tree orders its entries starting from the first characters of the indexed value (and for long string columns MySQL can be set up to index only a leading prefix). That's great when you have integers, but text search is tricky. When you run queries like

Model.where("name LIKE ?", "#{params[:name]}%")

the index will speed up the selection, but when you use queries like this:

Model.where("name LIKE ?", "%#{params[:name]}%")

it won't help you, because the match can start anywhere in the string, so the whole string has to be searched, and that string can be several hundred characters long. Having, say, the first 8 units of a 250-character string indexed is no improvement then. That is the first point.
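To illustrate the difference between the two LIKE patterns, here is a minimal sketch in plain Ruby. It uses a sorted array as a stand-in for a B-tree index; this is only an analogy, not how any database stores its index, and the names NAME_INDEX, prefix_search and infix_search are made up for the example.

```ruby
# A sorted array of names stands in for a B-tree index (analogy only).
NAME_INDEX = %w[alice alicia bob carol malice].sort.freeze

# Like "name LIKE 'ali%'": binary-search to the first entry >= prefix,
# then read forward while the entries still start with the prefix.
def prefix_search(index, prefix)
  start = index.bsearch_index { |name| name >= prefix }
  return [] if start.nil?

  index.drop(start).take_while { |name| name.start_with?(prefix) }
end

# Like "name LIKE '%ali%'": the sort order gives no starting point,
# so every entry has to be inspected.
def infix_search(index, fragment)
  index.select { |name| name.include?(fragment) }
end

p prefix_search(NAME_INDEX, "ali")
p infix_search(NAME_INDEX, "ali")
```

The prefix search can jump straight to the right region of the sorted data, while the infix search has to look at every entry, which is roughly what a full table scan does.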

Second, you should add a UNIQUE INDEX, because the database is better at finding duplicates than Ruby is! It is optimized for sorting, and it is definitely the shorter and cleaner way to deal with this problem. Of course you should also add a validation to the relevant model, but that is no reason to let things slide in the database.
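As a rough sketch of what relying on the unique index could look like: the helper below assumes a hypothetical Item model whose name column got a unique index in a migration via add_index :items, :name, unique: true. The helper name create_unless_duplicate and the model are illustrative, not taken from the question.

```ruby
# Hypothetical helper: `model` is assumed to be an ActiveRecord model class
# (for example Item) whose `name` column has a unique index, added in a
# migration with: add_index :items, :name, unique: true
#
# Instead of searching for a duplicate first, attempt the insert and let the
# database reject it. ActiveRecord raises ActiveRecord::RecordNotUnique when
# a unique index is violated.
def create_unless_duplicate(model, name)
  model.create!(name: name)
rescue ActiveRecord::RecordNotUnique
  nil # a row with this name already exists
end
```

In a Rails app this would be called as create_unless_duplicate(Item, "some name"). Note that if the model also has a uniqueness validation, create! will normally fail with ActiveRecord::RecordInvalid before the insert is attempted; the rescue above then only covers the case where two inserts race each other.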

About index speed: http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html

You don't have a large set of options. I don't think the insert speed loss will be that great when you only need one index, while the select speed gain grows with the size of the table!
