
Sort performance issue in django psql query

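I'm using Django with PostgreSQL, and this query can take 8 to 10 seconds, which is a serious performance problem.

I have a model, "Publication", that stores Instagram publications. I'm trying to fetch the publications within a given city, but the relationship is not direct, so the query is: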

instagram_publications = Publication.objects.filter(location__spot__city__name=location)

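So, in the models, we have: Publication [FK] -> Location [FK] -> Spot [FK] -> City. All of these models also inherit from TimeStampedModel.

Since the search is by city name, I added an index on City.name by setting db_index=True, but nothing changed.

I profiled the query with EXPLAIN and see a huge cost associated with a sort. It appears to sort the rows by the modified and created dates, which are fields inherited from TimeStampedModel. I think this sort is unnecessary, but I'm not sure how to avoid it.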

[PERFORMANCE ANALYSIS]> City filter Instagram
Sort  (cost=256874.73..257992.17 rows=446975 width=233) (actual time=294.240..343.831 rows=290637 loops=1)
  Sort Key: instanalysis_publication.modified DESC, instanalysis_publication.created DESC
  Sort Method: external merge  Disk: 60400kB
  ->  Nested Loop  (cost=1.00..114091.50 rows=446975 width=233) (actual time=0.055..110.515 rows=290637 loops=1)
        ->  Nested Loop  (cost=0.57..516.27 rows=2767 width=4) (actual time=0.044..3.145 rows=3374 loops=1)
              ->  Nested Loop  (cost=0.28..39.28 rows=504 width=4) (actual time=0.038..0.323 rows=829 loops=1)
                    ->  Seq Scan on instanalysis_city  (cost=0.00..1.10 rows=1 width=4) (actual time=0.011..0.013 rows=1 loops=1)
                          Filter: ((name)::text = 'Durban'::text)
                          Rows Removed by Filter: 7
                    ->  Index Scan using instanalysis_spot_c7141997 on instanalysis_spot  (cost=0.28..33.14 rows=504 width=8) (actual time=0.024..0.208 rows=829 loops=1)
                          Index Cond: (city_id = instanalysis_city.id)
              ->  Index Scan using instanalysis_instagramlocation_e72b53d4 on instanalysis_instagramlocation  (cost=0.29..0.89 rows=6 width=8) (actual time=0.001..0.003 rows=4 loops=829)
                    Index Cond: (spot_id = instanalysis_spot.id)
        ->  Index Scan using instanalysis_publication_e274a5da on instanalysis_publication  (cost=0.43..36.20 rows=485 width=233) (actual time=0.002..0.019 rows=86 loops=3374)
              Index Cond: (location_id = instanalysis_instagramlocation.id)
Planning time: 0.809 ms
Execution time: 355.928 ms

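The sort also appears to be done on disk (the plan reports "Sort Method: external merge  Disk: 60400kB"). I guess that is because there are so many rows that it cannot be done in memory.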
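For reading that line of the plan, here is a small sketch in plain Python. PostgreSQL reports an external merge when a sort does not fit in the session's work_mem, so the number after "Disk:" is the amount spilled to temporary files. The `describe_sort` helper and its regular expression are illustrative names introduced here, not part of Django or PostgreSQL.

```python
import re

# The sort line copied from the plan above.
PLAN_LINE = "  Sort Method: external merge  Disk: 60400kB"

# Matches e.g. "Sort Method: quicksort  Memory: 25kB" or
# "Sort Method: external merge  Disk: 60400kB".
SORT_RE = re.compile(
    r"Sort Method:\s+(?P<method>.+?)\s+(?P<where>Memory|Disk):\s+(?P<kb>\d+)kB"
)


def describe_sort(line):
    """Return (method, where, size_in_kB) for a 'Sort Method' line, or None."""
    match = SORT_RE.search(line)
    if match is None:
        return None
    return match.group("method"), match.group("where"), int(match.group("kb"))


method, where, kb = describe_sort(PLAN_LINE)
# Report where the sort ran and roughly how much space it used.
print(f"{method}: {kb} kB on {where.lower()} (~{kb / 1024:.1f} MiB)")
```

A line that says "Memory" instead of "Disk" would mean the sort fit in work_mem.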

The TimeStampedModel class comes from the django_extras package, and its ordering is defined in Meta:

class TimeStampedModel(models.Model):
    """ TimeStampedModel
    An abstract base class model that provides self-managed "created" and
    "modified" fields.
    """
    created = CreationDateTimeField(_('created'))
    modified = ModificationDateTimeField(_('modified'))

    def save(self, **kwargs):
        self.update_modified = kwargs.pop('update_modified', getattr(self, 'update_modified', True))
        super(TimeStampedModel, self).save(**kwargs)

    class Meta:
        get_latest_by = 'modified'
        ordering = ('-modified', '-created',)
        abstract = True

Is there any way to improve this, or to avoid the sort step?

Thanks

I finally found a way to avoid the sort: I overrode the Meta class in the Publication model and simply set ordering = None.

class Meta:
    #  Important: Override ordering inherited from TimeStampedModel to improve performance
    ordering = None

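Now the sort step no longer appears in the plan. However, regardless of what EXPLAIN ANALYZE reports, when I run the query and measure it, it takes the same time as before.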

[PERFORMANCE ANALYSIS]> City filter Instagram
Nested Loop  (cost=1.00..110920.28 rows=452607 width=233) (actual time=0.069..96.667 rows=290637 loops=1)
  ->  Nested Loop  (cost=0.57..516.27 rows=2767 width=4) (actual time=0.056..2.476 rows=3374 loops=1)
        ->  Nested Loop  (cost=0.28..39.28 rows=504 width=4) (actual time=0.047..0.254 rows=829 loops=1)
              ->  Seq Scan on instanalysis_city  (cost=0.00..1.10 rows=1 width=4) (actual time=0.014..0.016 rows=1 loops=1)
                    Filter: ((name)::text = 'Durban'::text)
                    Rows Removed by Filter: 7
              ->  Index Scan using instanalysis_spot_c7141997 on instanalysis_spot  (cost=0.28..33.14 rows=504 width=8) (actual time=0.029..0.170 rows=829 loops=1)
                    Index Cond: (city_id = instanalysis_city.id)
        ->  Index Scan using instanalysis_instagramlocation_e72b53d4 on instanalysis_instagramlocation  (cost=0.29..0.89 rows=6 width=8) (actual time=0.001..0.002 rows=4 loops=829)
              Index Cond: (spot_id = instanalysis_spot.id)
  ->  Index Scan using instanalysis_publication_e274a5da on instanalysis_publication  (cost=0.43..34.99 rows=491 width=233) (actual time=0.002..0.016 rows=86 loops=3374)
        Index Cond: (location_id = instanalysis_instagramlocation.id)
Planning time: 1.385 ms
Execution time: 103.446 ms

