
How to get pixel RGB values in Hadoop?

I have millions of images stored in HDFS, and I want to build an index of them. How can I get the pixel RGB values of these images? I am new to Hadoop, and the image data as stored in Hadoop appears to differ from the original binary image format. A second question: should I use a SequenceFile in Hadoop to pack these many images into one big file for efficiency? Many thanks.

I can answer only part of the question.

Should I use a SequenceFile in Hadoop to pack these many images into one big file for efficiency?

It depends on the size of the individual files. If each file is already really big, consolidating them will probably not help much; if the files are small, packing them together is more likely to pay off.

See this question on SO for more details.

If you have the additional storage and efficiency is important to you, I would definitely go with a SequenceFile. Hadoop will handle splitting the file up for you. We ran into a case where we were extracting data from imagery files, similar to what you are doing. In our case we were extracting metadata for ingestion into a discovery system, so that our imagery files could be searched from outside the cluster. Because efficiency was not a big deal for us, we simply processed the files individually, making sure they were not splittable. That way the other system can reach back over HTTP to grab the source files.
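For the pixel-value part of the question, here is a minimal sketch. It assumes each record hands you the complete, unmodified bytes of one image file (for example as a SequenceFile value, or as a whole unsplit file). Under that assumption the bytes can be decoded with the standard javax.imageio API, just as they could outside Hadoop. The class name PixelRgb and its methods are hypothetical helpers for illustration, not part of Hadoop.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper (not a Hadoop class): decodes whole-image bytes and reads pixels.
final class PixelRgb {
    private PixelRgb() {}

    /** Decodes the complete encoded bytes of one image (PNG, JPEG, ...) . */
    static BufferedImage decode(byte[] imageBytes) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(imageBytes));
            if (image == null) {
                // ImageIO.read returns null when no registered reader understands the data,
                // e.g. if the bytes are truncated to a split or padded with extra data.
                throw new IllegalArgumentException("bytes are not a decodable image");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns {red, green, blue}, each 0-255, for the pixel at (x, y). */
    static int[] rgbAt(BufferedImage image, int x, int y) {
        int argb = image.getRGB(x, y); // packed as 0xAARRGGBB
        return new int[] {(argb >> 16) & 0xFF, (argb >> 8) & 0xFF, argb & 0xFF};
    }
}
```

Inside a mapper you would call PixelRgb.decode on the record's bytes and loop over x and y up to image.getWidth() and image.getHeight(). If the value is a BytesWritable, pass only the valid bytes (for instance via copyBytes()), since the backing array may be longer than the data. Decoding only works on a whole file, which is one more reason to keep images unsplit or packed as single records.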
