
How to improve query speed over 2 million records in Django RESTful APIs

I have a dataset of 2 million scientific research publication records. I used Django REST framework to write APIs that search the title and abstract fields. A search takes 12 seconds with Postgres as the database; with MongoDB it drops to 6 seconds.

But even 6 seconds seems like a long wait for a user. I indexed title and abstract, but indexing abstract failed because some of the abstract texts are too long.

Here is the Django model using MongoDB (with MongoEngine as the ODM):

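from mongoengine import DateTimeField, Document, IntField, StringField


class Journal(Document):
    # Fields searched by the API
    title = StringField()
    journal_title = StringField()
    abstract = StringField()
    full_text = StringField()
    # Publication metadata
    pub_year = IntField()
    pub_date = DateTimeField()
    pmid = IntField()
    link = StringField()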

How do I improve the query performance? What stack would make search and retrieval faster?

Some pointers on optimising the Django ORM with Postgres:

  • Use db_index=True on fields that will be searched often and have some degree of repetition between entries, like "title".
  • Use values() and values_list() to select only the columns you want from a QuerySet.
  • If you are doing full text search on any of those columns (for example with a contains query), bear in mind that Django supports full text search directly on a Postgres database.
  • Use print(queryset.query) to check what SQL query is being sent to your database and whether it can be improved.
  • Many Postgres optimisation techniques rely on custom SQL queries, which can be written in Django using RawSQL expressions.
  • Remember that there are many, many ways to search for data in a database, be it relational or non-relational in nature. In your case, MongoDB is not "faster" than Postgres; it is just doing a better job of querying what you really want.
