
Image Recognition

Original post: 2008-09-27 03:50:07. Tags: image, image-processing, pixel

I'd like to do some work with the nitty-gritty of computer imaging. I'm looking for a way to read single pixels of data, analyze them programmatically, and change them. What is the best language to use for this (Python, C++, Java...)? What is the best file format?

I don't want any super fancy software/APIs... I'm looking for the bare basics.

10 Answers

If you need speed (and with image processing you'll probably always want speed), you definitely have to work with raw pixel data. Java has some real disadvantages here: you cannot access memory directly, which makes pixel access quite slow by comparison. C++ is definitely the language of choice for production image processing. But you can, for example, also use C#, as it allows unsafe code in specific areas. (Take a look at the Scan0 pointer property of the BitmapData class.) I've used C# successfully for image processing applications, and they are definitely much faster than their Java counterparts. I would not use any scripting language or Java for such a purpose.
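To make "raw pixel data" concrete, here is a minimal sketch of how pixels are usually addressed in a flat buffer, which is the same idea as combining a base pointer such as Scan0 with a row stride. It is written in Python only for readability; the names `WIDTH`, `STRIDE`, `offset`, `get_pixel`, and `set_pixel` are hypothetical, and it assumes a tightly packed 24-bit RGB layout with no row padding (real bitmap formats often pad each row).

```python
# Hypothetical example: a 4x3 RGB image stored as one flat byte buffer.
WIDTH, HEIGHT, CHANNELS = 4, 3, 3
STRIDE = WIDTH * CHANNELS            # bytes per row (assumes no row padding)

pixels = bytearray(STRIDE * HEIGHT)  # every pixel starts out black

def offset(x, y):
    """Byte offset of pixel (x, y) inside the flat buffer."""
    return y * STRIDE + x * CHANNELS

def get_pixel(buf, x, y):
    """Read one pixel as an (r, g, b) tuple."""
    i = offset(x, y)
    return tuple(buf[i:i + CHANNELS])

def set_pixel(buf, x, y, rgb):
    """Overwrite one pixel with an (r, g, b) tuple."""
    i = offset(x, y)
    buf[i:i + CHANNELS] = bytes(rgb)

set_pixel(pixels, 2, 1, (255, 128, 0))
print(get_pixel(pixels, 2, 1))  # (255, 128, 0)
```

The arithmetic in `offset` is the same in C, C++, or unsafe C#; only the cost of each access differs between languages.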

It's very easy to manipulate the large multi-dimensional or complex arrays of pixel information that make up pictures using high-level languages such as Python. There's a library called PIL (the Python Imaging Library) that is quite useful and will let you do general filters and transformations (change the brightness, soften, desaturate, crop, etc.) as well as manipulate the raw pixel data.

It is the easiest and simplest image library I've used to date, and it can be extended to do whatever you're interested in (edge detection in very little code, for example).
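As a rough illustration of the per-pixel access described above, here is a minimal sketch. It assumes Pillow, the maintained fork of PIL, is installed, and it builds a small in-memory image rather than opening a file, so no file name is assumed.

```python
from PIL import Image  # assumes Pillow (the maintained PIL fork) is installed

# Create a 4x4 RGB image filled with a single colour.
img = Image.new("RGB", (4, 4), (10, 20, 30))

# Read a single pixel.
original = img.getpixel((1, 2))

# Analyze and change every pixel: here, a simple colour inversion.
px = img.load()                 # pixel access object, indexed as px[x, y]
width, height = img.size
for y in range(height):
    for x in range(width):
        r, g, b = px[x, y]
        px[x, y] = (255 - r, 255 - g, 255 - b)

inverted = img.getpixel((1, 2))
print(original, inverted)  # (10, 20, 30) (245, 235, 225)
```

For an image on disk, `Image.open(path)` would replace `Image.new`; the pixel loop stays the same.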

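Not only is C/C++ faster, but most of the image processing sample code you find will be in C as well, so you'll be able to incorporate what you find more easily.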

I studied Artificial Intelligence and Computer Vision, so I know pretty well the kind of tools that are used in this field.

Basically: you can use whatever you want as long as you know how it works behind the scenes.

Now, depending on what you want to achieve, you can use:

C: you will lose a lot of time on bug checking and memory management when implementing your algorithms. Theoretically, this is the fastest language for that kind of job, but if your algorithms are not computationally efficient (in terms of complexity), or if you lose too much time on bug checking, it is clearly not worth it. So I would advise implementing your application in another language first; later you can always optimize small parts of your code with C bindings.
Octave/MATLAB: a very efficient language, almost as efficient as C, and you can write very elegant and succinct algorithms. If you are into vectorization, matrices, and linear operations, you should go with that. However, you won't be able to develop a whole application with this language; it's more focused on algorithms. You can always develop an interface in another language later.
Python: an all-in-one, elegant, and accessible language, used in gigantic applications such as those at Google and Facebook. You can do pretty much anything you want with Python, in any kind of application. It is a perfect fit if you want to build a full application (with client interaction and all, not only algorithms), or if you want to quickly draft a prototype using existing libraries, since Python has a very large set of high-quality libraries, like OpenCV. However, if you only want to write algorithms, you would be better off with Octave/MATLAB.

The answer that was selected as the solution is very biased, and you should be careful about this kind of archaic comment.

Nowadays, hardware is cheaper than wetware (humans), so you should use languages in which you can produce results faster, even if it costs a few CPU cycles or some memory.

Also, a lot of people tend to think that as long as you implement your software in C/C++, you have found the Holy Grail of speed: this is just not true. First, because algorithmic complexity matters a lot more than the language you are using (a bad algorithm will never beat a better algorithm, even if the better one is implemented in the slowest language in the universe), and secondly because high-level languages nowadays do a lot of caching and speed optimization for you, which can make your program run even faster than in C/C++.
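To illustrate the point about complexity with an image-processing flavour, here is a minimal sketch comparing two ways of summing every window of k neighbouring values in one image row (the core of a box blur). The function names are hypothetical. The first redoes the whole sum for each window, costing O(n * k); the second keeps a running sum and costs O(n) regardless of k.

```python
def box_sums_naive(row, k):
    """Sum of each window of k consecutive values: O(n * k) work."""
    return [sum(row[i:i + k]) for i in range(len(row) - k + 1)]

def box_sums_running(row, k):
    """Same result using a running sum: O(n) work, independent of k."""
    if len(row) < k:
        return []
    total = sum(row[:k])          # sum of the first window
    out = [total]
    for i in range(k, len(row)):
        # Slide the window: add the entering value, drop the leaving one.
        total += row[i] - row[i - k]
        out.append(total)
    return out

row = [10, 20, 30, 40, 50, 60]
print(box_sums_naive(row, 3))    # [60, 90, 120, 150]
print(box_sums_running(row, 3))  # [60, 90, 120, 150]
```

For large windows the second version does far less work per pixel, and that difference carries over to any language it is written in.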

Of course, you can always do all of the above in C/C++, but how much of your time are you willing to waste reinventing the wheel?

(This might not apply to the OP, who only wanted the bare basics -- but now that the speed issue has been brought up, I do need to write this, just for the record.)

If you really need speed, it's better to forget about working at the pixel-by-pixel level and instead see whether the operations you need to perform can be vectorized. For example, for your C/C++ code you could use the excellent Intel IPP library (no, I don't work for Intel).
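The idea of expressing an operation over the whole buffer instead of looping over pixels can be sketched even without a specialized library. The example below is not SIMD vectorization in the Intel IPP sense; it is only an analogue using Python's built-in `bytes.translate`, which applies a 256-entry lookup table to an entire 8-bit buffer in one call. The threshold value and variable names are assumptions chosen for illustration.

```python
# An 8-bit grayscale "image" as a flat bytes object (hypothetical sample values).
gray = bytes([0, 50, 100, 150, 200, 255])
THRESHOLD = 128

# Pixel-by-pixel: one Python-level decision per pixel.
per_pixel = bytes(255 if v >= THRESHOLD else 0 for v in gray)

# Whole-buffer: build a lookup table once, then map every pixel in one call.
table = bytes(255 if v >= THRESHOLD else 0 for v in range(256))
whole_buffer = gray.translate(table)

print(list(whole_buffer))          # [0, 0, 0, 255, 255, 255]
print(per_pixel == whole_buffer)   # True
```

Both versions give the same result, but the second moves the per-pixel loop out of the interpreter, which is the same shift in thinking that vectorized libraries ask for.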

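If you're looking to do numerical work on your images (think matrices) and you're into Python, check out http://www.scipy.org/PyLab - this is basically the ability to do MATLAB in Python; buddies of mine swear by it.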

It depends a little on what you're trying to do.

If runtime speed is your issue, then C++ is the best way to go.

If speed of development is an issue, though, I would suggest looking at Java. You said that you wanted low-level manipulation of pixels, which Java will do for you. But the other thing that might be an issue is the handling of the various file formats. Java does have some very nice APIs for reading and writing various image formats to file (in particular the Java2D library; you can choose to ignore the higher levels of the API).

If you do go for the C++ option (or Python, come to think of it), I would again suggest using a library to get you over the startup issues of reading and writing files. I've previously had success with libgd.

What language do you know best? To me, this is the real question. If you're going to spend months and months learning one particular language, then there's no real advantage in using Python or Java just for their (yet to be proven) development speed. I'm particularly proficient in C++, and I think that for this particular task I can be as speedy as a Java programmer, for example. With the aid of a good library (OpenCV comes to mind) you can create anything you need in a couple of lines of C++ code, really.

Short answer: C++ and OpenCV

Short answer? I'd say C++; you have far more flexibility in manipulating raw chunks of memory than in Python or Java.