
Database management for images

A little background: the previous project manager was fired for not delivering the project on time. I have little coding experience, but I am now leading the team to finish the website.

The website is similar to eBay: an item is added for sale. Images and documents are associated with the item, but hosted in folders that are created when the image is uploaded. The dev team has asked me how to manage the folders holding the documents in relation to the item listing. There will be between 1 and 10 images/documents uploaded per item, and between 1,000 and 2,000 items listed at any one time (if not more).

From looking around, I believe the easiest solution is to name the folder after the item number and store the reference in MySQL. Each item will have its own item number, and there should be no duplicates. Are there better solutions for the folder management?

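As already suggested, you can rename the images using a productid-docid-imageid-timestamp pattern. If the images are not retrieved often, storing them as BLOBs in the database and serving them under different names may also help.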

What you want to be careful with is that many filesystems limit how many entries can be stored in a folder; on Linux the limit is typically around 30,000 (ext3, for example, allows roughly 32,000 subdirectories per directory). With the numbers you give there should be little concern, but you should still plan for the system to be future-proof.

I have found it quite useful to store images by their hash. For instance, compute the SHA-1 hash of the image, e.g. cce7190663c547d026a6bf8fc8d2f40b3b1b9ea5, then store the image in a directory structure based on this hash, with a few levels of folders:

cce/719/066/3c5/cce7190663c547d026a6bf8fc8d2f40b3b1b9ea5.jpg

This uses the first 12 characters of the hash to form a folder structure 4 levels deep; the file name is the entire hash. Increase or decrease the folder depth as necessary. Each level has 16^3 = 4,096 possible folder names, so this lets you store quite a lot of images ((16^3)^4 leaf folders, about 2.8 × 10^14, times the per-folder limit) without hitting the filesystem limits. You then save this path in a database along with other information about the image and which items it belongs to. This method also effectively de-duplicates your storage: identical files hash to the same path, so you never store the same image twice.
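A minimal sketch of the scheme described above, in Python. The function names, the item_images table, and the use of sqlite3 as a stand-in for MySQL are assumptions made for illustration, not part of the original answer.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def hashed_relpath(digest: str, ext: str = ".jpg", levels: int = 4, width: int = 3) -> Path:
    """Build e.g. cce/719/066/3c5/<full digest>.jpg from a hex digest."""
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(*parts) / f"{digest}{ext}"


def store_image(root: Path, data: bytes, ext: str = ".jpg") -> Path:
    """Write data under root at its hash-derived path and return the relative path."""
    digest = hashlib.sha1(data).hexdigest()
    relpath = hashed_relpath(digest, ext)
    target = root / relpath
    # Identical bytes always hash to the same path, so an existing file
    # means the image is already stored and the write can be skipped.
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return relpath


# Hypothetical table linking items to stored files (sqlite3 stands in for MySQL).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE item_images (item_id INTEGER NOT NULL, path TEXT NOT NULL)")

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = store_image(root, b"example image bytes")
    second = store_image(root, b"example image bytes")  # same content, same path
    # Two different items can reference the same stored file.
    db.execute("INSERT INTO item_images VALUES (?, ?)", (1002003, first.as_posix()))
    db.execute("INSERT INTO item_images VALUES (?, ?)", (1002004, second.as_posix()))
    stored_files = [p for p in root.rglob("*") if p.is_file()]
```

Because two items may point at the same file, a file should only be deleted once no row in the table references its path any more.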

It used to be that filesystem performance would deteriorate if there were too many files in a directory, so the common wisdom was to limit any directory to about 1,000 items.

Try creating a directory structure around the zero-padded item_id, so #1002003 becomes 001002003, which could be found at 001/002/001002003.jpg.

Since you're storing more than one image per item, you might add one more level, e.g. 001/002/003/001002003_1.jpg.

Use the full ID as the file's name in the final directory (001002003.jpg, not 003.jpg). It'll come in handy later.
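As a rough sketch of that layout in Python (the function name, the 9-digit padding, and the _1, _2 suffix for numbering an item's images are assumptions taken from the examples above):

```python
def item_image_path(item_id: int, index: int, ext: str = ".jpg", digits: int = 9) -> str:
    """Return a path such as 001/002/003/001002003_1.jpg for item 1002003, image 1."""
    if item_id < 0:
        raise ValueError("item_id must be non-negative")
    if digits % 3 != 0:
        raise ValueError("digits must be a multiple of 3")
    padded = str(item_id).zfill(digits)
    if len(padded) > digits:
        raise ValueError("item_id does not fit in the configured number of digits")
    # Split the padded ID into 3-digit folder names, so each level
    # holds at most 1,000 entries.
    folders = [padded[i:i + 3] for i in range(0, digits, 3)]
    # The file name keeps the full ID, so a file can still be identified
    # if it is ever moved out of its folder.
    return "/".join(folders) + f"/{padded}_{index}{ext}"


example = item_image_path(1002003, 1)
```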

Hope that helps.


 