How to improve query speed over 2 million records in a Django RESTful API

Question

I have a dataset of 2 million scientific research publication records. I used Django REST framework to write APIs that search the title and abstract fields. A search takes 12 seconds with Postgres as the database; with MongoDB it goes down to 6 seconds.

But even 6 seconds seems like a long wait for a user. I indexed title and abstract, but indexing abstract failed because some of the abstract texts are too long.

Here is the Django model using MongoDB (MongoEngine as the ODM):

from mongoengine import Document, DateTimeField, IntField, StringField


class Journal(Document):
    title = StringField()
    journal_title = StringField()
    abstract = StringField()
    full_text = StringField()
    pub_year = IntField()
    pub_date = DateTimeField()
    pmid = IntField()
    link = StringField()

How do I improve the query performance? What stack would make search and retrieval faster?

Answer 1

Some pointers on optimising the Django ORM with Postgres:

Use db_index=True on fields that will be searched often and have some degree of repetition between entries, like "title".
Use values() and values_list() to select only the columns you want from a QuerySet.
If you're doing full text search on any of those columns (like a contains query), bear in mind that Django supports full text search directly on a Postgres database.
Use print(queryset.query) to check what SQL is being sent to your database and whether it can be improved.
Many Postgres optimisation techniques rely on custom SQL queries, which can be written in Django using RawSQL expressions.
Remember that there are many, many ways to search for data in a database, be it relational or non-relational in nature. In your case, MongoDB is not "faster" than Postgres, it's just doing a better job at querying what you really want.
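The full text search support mentioned above lives in django.contrib.postgres.search. The following is a sketch only: the function name search_publications, the limit parameter, and the choice of weights are assumptions, and it needs a Postgres database (with 'django.contrib.postgres' typically added to INSTALLED_APPS) to actually execute.

```python
def search_publications(queryset, terms, limit=20):
    """Rank rows of `queryset` against `terms` using Postgres full text search."""
    # Imported inside the function so this module still loads on machines
    # without a Postgres driver; in a real project these go at the top.
    from django.contrib.postgres.search import (
        SearchQuery,
        SearchRank,
        SearchVector,
    )

    # Weight title matches (A) above abstract matches (B).
    vector = SearchVector("title", weight="A") + SearchVector("abstract", weight="B")
    query = SearchQuery(terms)
    return (
        queryset.annotate(search=vector, rank=SearchRank(vector, query))
        .filter(search=query)  # keep only rows that match the query
        .order_by("-rank")  # best matches first
        .values("id", "title", "rank")[:limit]
    )
```

It could be called as search_publications(Publication.objects.all(), "gene therapy"), where Publication is a hypothetical model with title and abstract fields. Building the search vector for every row at query time still scans the table; for a table of this size the Django documentation describes storing the vector in a SearchVectorField and indexing it with a GIN index.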

Answer 1
4 2017-03-31 05:27:14