Linux filesystem: a million symlinks vs. a million files

I'm working on a Linux filesystem-based caching system for a web application, to be used as a last resort when APC and Memcache are unavailable. The system will cache between 500,000 and 1,000,000 unique string identifiers, each with a large value. I take the MD5 hash of the string ID and create subfolders from its first few characters, so that no single directory ends up with too many files.
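For illustration, the sharding scheme described above could look roughly like this in Python. The function name `cache_path`, the two-level layout, and the two-character shard width are assumptions made for the sketch; the post only says "the first few chars".

```python
import hashlib
from pathlib import Path


def cache_path(root: Path, string_id: str, levels: int = 2, width: int = 2) -> Path:
    """Map a string ID to a sharded path like root/ab/cd/<full md5 hex digest>.

    `levels` and `width` are illustrative defaults, not values from the post.
    """
    digest = hashlib.md5(string_id.encode("utf-8")).hexdigest()
    # Take the first `levels` chunks of `width` hex characters as directory names.
    shards = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return root.joinpath(*shards, digest)
```

With two levels of two hex characters there are at most 65,536 leaf directories, so 1,000,000 entries average out to about 15 per directory.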

I know this concept works because I'm using it in a similar application.

Although there are up to 1,000,000 string IDs, each one points to one of only 18,000 unique values, so, for instance, there might be 100,000 string IDs that all point to the same value. Right now this means there are 100,000 files with different filenames containing the same content, which is bad for the underlying filesystem cache.

Is there any disadvantage to caching the 18,000 unique values and then, for every unique string ID, creating a symlink to the unique value file? This way the filesystem buffer can cache the 18,000 files and the descriptors for the symlinks.
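A minimal sketch of the proposed layout, assuming each value file is named after the MD5 of its content and each ID gets one symlink. The `values`/`ids` directory names and the function `store_with_symlink` are hypothetical, and subfolder sharding is left out to keep it short.

```python
import hashlib
import os
from pathlib import Path


def store_with_symlink(root: Path, string_id: str, value: bytes) -> Path:
    """Write `value` once (keyed by its content hash) and point the ID at it."""
    values_dir = root / "values"   # hypothetical directory for the unique values
    links_dir = root / "ids"       # hypothetical directory for the per-ID symlinks
    values_dir.mkdir(parents=True, exist_ok=True)
    links_dir.mkdir(parents=True, exist_ok=True)

    # Identical content always maps to the same file, so it is stored only once.
    value_file = values_dir / hashlib.md5(value).hexdigest()
    if not value_file.exists():
        value_file.write_bytes(value)

    link = links_dir / hashlib.md5(string_id.encode("utf-8")).hexdigest()
    # lexists() is true even for a dangling symlink, so a stale link is replaced.
    if os.path.lexists(link):
        os.unlink(link)
    os.symlink(value_file, link)
    return link
```

Reading through the returned link with `link.read_bytes()` follows the symlink transparently, so the reader code does not need to know about the deduplication.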

I'm just concerned about having 1,000,000 symlinks and any potential problems this may introduce.

Thanks in advance!

Compared to storing plain files, no, there is no disadvantage to storing symlinks. Performance will be slightly slower because of the indirection, but dentries and inodes are cached too.

However, I strongly suggest you use hard links instead, because that way the content will stay around until the last of the links is deleted.
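To make that difference concrete, here is a small sketch that removes the original file and checks what each kind of link still sees. The helper name and file names are made up for the example, and it assumes a filesystem that supports both link types.

```python
import os
from pathlib import Path


def links_readable_after_delete(workdir: Path):
    """Return (symlink_readable, hardlink_readable) once the original is gone."""
    original = workdir / "value.bin"
    original.write_bytes(b"cached value")

    soft = workdir / "id_soft"
    hard = workdir / "id_hard"
    os.symlink(original, soft)  # stores a path that points at value.bin
    os.link(original, hard)     # adds a second name for the same inode

    original.unlink()
    # Path.exists() follows symlinks, so it reports False for a dangling one.
    return soft.exists(), hard.exists()
```

On a typical Linux filesystem this returns `(False, True)`: the symlink dangles, while the hard link still reaches the data.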

I agree with sehe, and please also note that hard links will use only 18,000 inodes instead of 10^6; a hard link only uses an additional directory entry that points to the one and only inode. You will save about 10^6 * inode size bytes on disk and in your memory cache.
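The inode claim can be checked on a small scale. The sketch below creates one value file plus `n_ids` links to it and counts the distinct inodes involved; `count_inodes` and `build_links` are hypothetical helpers written for this example.

```python
import os
from pathlib import Path


def count_inodes(paths) -> int:
    """Count distinct (device, inode) pairs; lstat() does not follow symlinks."""
    return len({(st.st_dev, st.st_ino) for st in (os.lstat(p) for p in paths)})


def build_links(workdir: Path, n_ids: int, hard: bool):
    """Create one value file and n_ids links to it; return every path created."""
    workdir.mkdir(parents=True, exist_ok=True)
    value = workdir / "value.bin"
    value.write_bytes(b"cached value")
    paths = [value]
    for i in range(n_ids):
        link = workdir / f"id_{i}"
        if hard:
            os.link(value, link)     # new directory entry, same inode
        else:
            os.symlink(value, link)  # new inode that holds the target path
        paths.append(link)
    return paths
```

With hard links, `count_inodes` stays at 1 no matter how many IDs are added; with symlinks it is `n_ids + 1`. One restriction to keep in mind: `os.link` only works when both names are on the same filesystem.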
