
MongoDB gridFS - filename length, indexing, performance

I'm studying GridFS and I have a few questions.

1) GridFS automatically indexes files by the generated _id. But most of the time I fetch files by their filename, so should I create an index on 'filename' myself?

2) GridFS doesn't have folders, just filenames, but I can mimic folders by using filenames with slashes, such as '/images/avatars/35.jpg', right?

3) If I index on "filename", is it better in performance terms to use short filenames? I mean, if I use the user's _id, which is 24 characters long, plus suffixes, for example "/images/avatar_4f1d36b58e42ba3836ed178e_t.jpg", wouldn't indexing such a long field slow down my system? Would it be better (faster) to use the user's short login instead of the _id?

1) I'd be very surprised if the filename weren't indexed. It's used throughout the API, so I assume that it is indexed.

2) Yes, you can, but no real notion of directories is implied, so listing files or directories is a bit more complicated. In other words, the path is just a label.
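To make the "just a label" point concrete, here is a minimal Python sketch of how folder listing could be emulated over flat filenames. The helper names `prefix_filter` and `list_children` are hypothetical and not part of any driver. The filter dict is only an assumption about what one might pass to a `find` call on the GridFS files collection; the anchored `$regex` prefix match is ordinary MongoDB query syntax.

```python
import re


def prefix_filter(folder):
    """Build a query filter matching every filename under a pseudo-folder."""
    if not folder.endswith("/"):
        folder += "/"
    # The leading ^ anchors the pattern so it is a prefix match;
    # re.escape guards against regex metacharacters in the folder name.
    return {"filename": {"$regex": "^" + re.escape(folder)}}


def list_children(filenames, folder):
    """Split filenames under `folder` into immediate subfolders and files."""
    if not folder.endswith("/"):
        folder += "/"
    dirs, files = set(), set()
    for name in filenames:
        if not name.startswith(folder):
            continue
        rest = name[len(folder):]
        head, sep, _ = rest.partition("/")
        if sep:
            # Something follows a slash, so `head` acts as a subfolder.
            dirs.add(head)
        elif head:
            files.add(head)
    return sorted(dirs), sorted(files)
```

The directory structure exists only in the client code: the database stores a flat list of names, and "subfolders" are derived by splitting on the slash.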

3) Indexes use hashes, or fixed length strings, so a long key is just as easy to index as a short one.

1) The specification doesn't require the filename to be indexed. You would want to check the code in your driver, or just create an index yourself. One thing that you should take into consideration is that filenames don't have to be unique. You might reconsider your design, and query on the _id instead.

2) Yes.

3) The B-tree indexes in MongoDB do not use hashes. A larger string will take more space in the index, thus taking more RAM, but I don't think performance will be affected much (unless you count using more RAM as taking a performance hit). A good rule of thumb for MongoDB is that your indexes (and your "working set") should fit in RAM. If you can rework your application to query on the _id instead of the filename, you wouldn't have to worry about the space for this index.
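As a rough back-of-the-envelope illustration of the space argument, the following Python sketch compares the raw key bytes for the long filename from the question with a shorter one. The short login "bob", the file count, and the helper name `raw_key_bytes` are assumptions made for illustration. The figure counts only the filename bytes themselves and ignores per-entry overhead, B-tree structure, and any compression the storage engine may apply, so real index sizes will differ.

```python
def raw_key_bytes(filename, n_files):
    """Raw UTF-8 bytes of the filename keys if n_files names have this length."""
    return len(filename.encode("utf-8")) * n_files


# Filename from the question: a 24-character _id plus prefix and suffix.
long_name = "/images/avatar_4f1d36b58e42ba3836ed178e_t.jpg"
# Hypothetical alternative using a short login instead of the _id.
short_name = "/images/avatar_bob_t.jpg"

n_files = 1_000_000  # assumed collection size, for illustration only

long_total = raw_key_bytes(long_name, n_files)
short_total = raw_key_bytes(short_name, n_files)
saved_mb = (long_total - short_total) / 1_000_000

print(f"long keys:  {long_total} bytes")
print(f"short keys: {short_total} bytes")
print(f"difference: {saved_mb:.0f} MB of raw key data")
```

Under these assumptions the difference is a few tens of megabytes per million files, which matters mainly when the index is already close to not fitting in RAM.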

GridFS has a default index on _id (obviously) and a compound index on filename and uploadDate.
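If your driver does not create that compound index, it can be created by hand. The sketch below assumes `files_collection` is a PyMongo `Collection` for the GridFS files collection (by default `fs.files`); the two function names are hypothetical. It uses only `create_index` and `find_one`, and takes the collection as a parameter so no connection details are assumed. Because filenames are not required to be unique, the lookup sorts by uploadDate to pick the newest match.

```python
def ensure_filename_index(files_collection):
    """Create the (filename, uploadDate) compound index.

    Assumes files_collection behaves like a PyMongo Collection,
    e.g. db["fs.files"] for the default GridFS bucket.
    """
    # 1 means ascending order for each key.
    return files_collection.create_index([("filename", 1), ("uploadDate", 1)])


def latest_by_filename(files_collection, filename):
    """Return the newest files document with this filename, or None."""
    # -1 sorts by uploadDate descending, so the most recent upload wins.
    return files_collection.find_one(
        {"filename": filename},
        sort=[("uploadDate", -1)],
    )
```

A query on filename sorted by uploadDate can be served by this one index, which is why the two fields are combined rather than indexed separately.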

