Multiple Output Vectors for a single Input in Keras

I want to create a Neural Network in Keras for converting handwriting into computer letters.

My first step is to convert a sentence into an array. My array has the shape (1, number of letters, 27). Now I want to feed it into my deep neural network and train.
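For context, a minimal sketch of how such an encoding might be built, assuming a 27-symbol alphabet of 26 lowercase letters plus a space; the alphabet and the function name are illustrative, not taken from the question:

    import numpy as np

    # Hypothetical alphabet: 26 lowercase letters plus space -> 27 classes.
    ALPHABET = "abcdefghijklmnopqrstuvwxyz "

    def encode_sentence(sentence):
        """One-hot encode a sentence into shape (1, number of letters, 27)."""
        encoded = np.zeros((1, len(sentence), len(ALPHABET)), dtype="float32")
        for i, ch in enumerate(sentence.lower()):
            encoded[0, i, ALPHABET.index(ch)] = 1.0
        return encoded

    print(encode_sentence("hello world").shape)  # (1, 11, 27)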

But how do I input it properly if its dimensions don't match those of my image? And how do I get my predict function to give me an output array of shape (1, number of letters, 27)?

It seems you are attempting Handwriting Recognition, or more broadly Optical Character Recognition (OCR). This is quite a broad field and there are many ways to proceed. That said, one approach I suggest is the following:

It is commonly known that neural networks have fixed-size inputs; that is, if you build one to take, say, inputs of shape (28, 28, 1), then the model will expect that shape as its input. Therefore, having a dimension in your samples that depends on the number of letters in a sentence (something variable) is not recommended, as you will not be able to train a model that way with plain NNs.
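For illustration, a minimal sketch of such a fixed-input model in Keras: a small CNN that takes a (28, 28, 1) image and classifies it into 27 character classes (26 letters plus space, an assumed class count); the layer sizes are arbitrary placeholders:

    from tensorflow import keras
    from tensorflow.keras import layers

    # Character classifier with a fixed input shape of (28, 28, 1).
    # The 27 output classes (26 letters + space) are an assumption for illustration.
    model = keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(27, activation="softmax"),  # one probability per character class
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

Such a model can be trained on individual character images (step 1 below) and then reused on every window in step 2.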

Training such a model becomes possible if you design it to predict one character at a time, instead of a whole sentence that can have different lengths, and then group the predicted characters. The steps you could try to achieve this are:

  1. Obtain training samples for the characters you wish to recognize (the MNIST database, for example), and design and train your model to predict one character at a time (a minimal model of this kind is sketched above).

  2. Take the image with the writing to classify and pass a Sliding Window over it that matches your expected input size (say, a 28x28 window). Then classify each of those windows as a character (see the sketch after this list). Instead of a sliding window, you could try isolating your desired features somehow and classify those 28x28 segments instead.

  3. Group the predicted characters somehow so you get words (probably grouping those separated by empty spaces), or do whatever you want with the predictions.
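A rough sketch of steps 2 and 3, assuming the character classifier above has been trained, that the handwriting has already been cropped into a grayscale line image of height 28, and that the same 27-symbol alphabet is used; the window size, stride and decoding are illustrative only, not a complete OCR pipeline:

    import numpy as np

    ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # assumed 27 classes, space included

    def predict_line(model, line_image, window=28, stride=28):
        """Slide a window across a (28, width) line image, classify each
        window as one character, and group the characters into words."""
        probs_per_window = []
        for x in range(0, line_image.shape[1] - window + 1, stride):
            patch = line_image[:, x:x + window]
            patch = patch.reshape(1, window, window, 1).astype("float32") / 255.0
            probs_per_window.append(model.predict(patch, verbose=0)[0])  # shape (27,)
        probs = np.stack(probs_per_window)[None]        # shape (1, number of letters, 27)
        chars = [ALPHABET[i] for i in probs[0].argmax(axis=-1)]
        return "".join(chars).split(), probs            # grouped words, plus raw probabilities

Stacking the per-window probability vectors also gives an array of shape (1, number of letters, 27), which is the kind of output the question asks for.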

You can also try searching for tutorials or guides on handwriting recognition, like this one, which I have found quite useful. Hope this helps you get on track, good luck.
