
How should I structure my MongoDB compound index?

I have a mongo collection of image metadata with the following fields: camera_name (str), photographer_name (str), resolution (str), image_size (int, in MB, rounded), and timestamp (10-digit UNIX timestamp).

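I only want to run 2 queries:

  1. Given a camera_name, return the records with timestamp <= 1639457261 (an example UNIX timestamp). The records must be sorted in descending order.
  2. Given a camera_name, photographer_name, resolution, image_size, and timestamp, retrieve the matching records, sorted in descending order by the timestamp entered.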

I created 2 indexes:

  1. { "camera_name": 1, "timestamp": -1 }
  2. { "camera_name": 1, "photographer_name": 1, "resolution": 1, "image_size": 1, "timestamp": -1}
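
For reference, the two indexes above could be created from Python roughly as follows. This is a minimal sketch, assuming the PyMongo driver; the helper name `create_image_indexes`, the `collection` argument, and the connection details in the usage comment are hypothetical and do not come from the question. Directions are written as the plain integers 1 (ascending) and -1 (descending).

```python
# Index specifications from the question, as (field, direction) pairs.
# 1 = ascending, -1 = descending.
CAMERA_TIME_INDEX = [("camera_name", 1), ("timestamp", -1)]
FULL_INDEX = [
    ("camera_name", 1),
    ("photographer_name", 1),
    ("resolution", 1),
    ("image_size", 1),
    ("timestamp", -1),
]


def create_image_indexes(collection):
    """Create both compound indexes and return their names.

    `collection` is assumed to expose create_index(keys), as
    pymongo.collection.Collection does.
    """
    return [
        collection.create_index(spec)
        for spec in (CAMERA_TIME_INDEX, FULL_INDEX)
    ]


# Hypothetical usage (database and collection names are placeholders):
# from pymongo import MongoClient
# images = MongoClient("mongodb://localhost:27017")["photos"]["images"]
# create_image_indexes(images)
```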

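Queries that use the first index work, but when I run the query that targets the second index, no records are returned. I am sure the matching records exist in the collection, and I expect to get at least 10 records from the second query, but it returns an empty list.

Is something wrong with the way the indexes are configured? Thanks

Here is the sample data: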

{"camera_name": "Nikon", "photographer_name": "Aaron", "resolution": "1920x1080", "image_size": "3", "timestamp": 1397232415}
{"camera_name": "Nikon", "photographer_name": "Paul", "resolution": "1920x1080", "image_size": "4", "timestamp": 1717286853}
{"camera_name": "Nikon", "photographer_name": "Beth", "resolution": "720x480", "image_size": "1", "timestamp": 1503582086}
{"camera_name": "Nikon", "photographer_name": "Aaron", "resolution": "1920x1080", "image_size": "4", "timestamp": 1500628458}
{"camera_name": "Nikon", "photographer_name": "Paul", "resolution": "1920x1080", "image_size": "6", "timestamp": 1407580951}
{"camera_name": "Canon", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1166049453}
{"camera_name": "Canon", "photographer_name": "Paul", "resolution": "720x480", "image_size": "2", "timestamp": 1086317569}
{"camera_name": "Canon", "photographer_name": "Beth", "resolution": "720x480", "image_size": "1", "timestamp": 1400638926}
{"camera_name": "Canon", "photographer_name": "Aaron", "resolution": "720x480", "image_size": "1", "timestamp": 1345248762}
{"camera_name": "Canon", "photographer_name": "Paul", "resolution": "1920x1080", "image_size": "5", "timestamp": 1462360853}
{"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "720x480", "image_size": "2", "timestamp": 1815298047}
{"camera_name": "Fuji", "photographer_name": "Shane", "resolution": "720x480", "image_size": "3", "timestamp": 1666493455}
{"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1846677247}
{"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1630996389}
{"camera_name": "Fuji", "photographer_name": "Shane", "resolution": "720x480", "image_size": "2", "timestamp": 1816829362}

The queries I ran:

  1. camera_name=Nikon and timestamp<=1503582086 should return 4 records
  2. camera_name='Fuji', photographer_name='Beth', resolution='1920x1080', image_size='5' and timestamp<=1900000000 should return 2 records, but I get 0 records
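
Written as MongoDB filter documents, the two queries might look like the sketch below. The question does not show the actual query code, so the helper names `camera_before` and `exact_before` and the `images` collection in the usage comment are assumptions made for illustration.

```python
# Sort specification shared by both queries: newest timestamp first.
NEWEST_FIRST = [("timestamp", -1)]


def camera_before(camera_name, max_timestamp):
    """Filter for query 1: one camera, timestamp at or before a cutoff."""
    return {
        "camera_name": camera_name,
        "timestamp": {"$lte": max_timestamp},
    }


def exact_before(camera_name, photographer_name, resolution,
                 image_size, max_timestamp):
    """Filter for query 2: equality on four fields plus a timestamp cutoff."""
    return {
        "camera_name": camera_name,
        "photographer_name": photographer_name,
        "resolution": resolution,
        "image_size": image_size,
        "timestamp": {"$lte": max_timestamp},
    }


# Hypothetical usage with a PyMongo collection named `images`:
# q1 = images.find(camera_before("Nikon", 1503582086)).sort(NEWEST_FIRST)
# q2 = images.find(
#     exact_before("Fuji", "Beth", "1920x1080", "5", 1900000000)
# ).sort(NEWEST_FIRST)
```

Both filters put the equality fields first and the range on timestamp last, which is the same field order as the two indexes.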

Indexes do not "filter" results. They let you reach the data faster by scanning the index tree instead of scanning the raw documents.

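This means that if the second query "returns nothing", it has nothing to do with whichever indexes you built; the actual query you are using simply does not match any documents in the database.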
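
One way to check this without touching any index is to run the filter against the sample documents by hand. The sketch below is a tiny in-memory stand-in for MongoDB matching (equality and `$lte` only; `matches` and `find_newest_first` are hypothetical helpers, not MongoDB APIs) applied to the five Fuji documents from the question. It also illustrates one possible cause of an empty result: the question describes image_size as an int, while the sample data stores it as a string, and equality between a string and a number does not match. Whether that is what happened in the asker's real collection is not stated in the source.

```python
# The five Fuji documents from the sample data (image_size is a string there).
FUJI_DOCS = [
    {"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "720x480", "image_size": "2", "timestamp": 1815298047},
    {"camera_name": "Fuji", "photographer_name": "Shane", "resolution": "720x480", "image_size": "3", "timestamp": 1666493455},
    {"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1846677247},
    {"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1630996389},
    {"camera_name": "Fuji", "photographer_name": "Shane", "resolution": "720x480", "image_size": "2", "timestamp": 1816829362},
]


def matches(doc, query):
    """Simplified matching: plain equality, or {"$lte": x} on a field."""
    for field, cond in query.items():
        if isinstance(cond, dict):  # only {"$lte": x} is supported here
            if field not in doc or not doc[field] <= cond["$lte"]:
                return False
        elif doc.get(field) != cond:
            return False
    return True


def find_newest_first(docs, query):
    """Filter by full scan (no index involved), then sort newest first."""
    hits = [d for d in docs if matches(d, query)]
    return sorted(hits, key=lambda d: d["timestamp"], reverse=True)


as_string = {
    "camera_name": "Fuji",
    "photographer_name": "Beth",
    "resolution": "1920x1080",
    "image_size": "5",
    "timestamp": {"$lte": 1900000000},
}
as_int = dict(as_string, image_size=5)  # same query, numeric image_size

print(len(find_newest_first(FUJI_DOCS, as_string)))  # 2
print(len(find_newest_first(FUJI_DOCS, as_int)))     # 0
```

The result depends only on the filter values and the stored documents. Adding or dropping an index changes how quickly the matches are found, not which documents match.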

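I will also mention that your second index could be smaller (depending on assumptions such as scale and data distribution), which would help update/insert performance and also reduce storage size. However, judging from your raw data, I assume these are not pressing considerations for you.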

