Confused about tensor dimensions and batch sizes in PyTorch

Question

I'm very new to PyTorch and neural networks in general, and I'm having some problems creating a neural network that classifies names by gender.
I based this on the PyTorch tutorial for RNNs that classify names by nationality, but I decided not to go with a recurrent approach... Stop me right here if this was the wrong idea!
However, whenever I try to run an input through the network, it tells me:

RuntimeError: matrices expected, got 3D, 2D tensors at /py/conda-bld/pytorch_1493681908901/work/torch/lib/TH/generic/THTensorMath.c:1232

I know this has something to do with how PyTorch always expects there to be a batch size or something, and I have my tensor set up that way, but you can probably tell by this point that I have no idea what I'm talking about. Here's my code:

from __future__ import unicode_literals, print_function, division
from io import open
import glob
import unicodedata
import string
import torch
import torchvision
import torch.nn as nn
import torch.optim as optim
import random
from torch.autograd import Variable
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

"""------GLOBAL VARIABLES------"""

all_letters = string.ascii_letters + " .,;'"
num_letters = len(all_letters)
all_names = {}
genders = ["Female", "Male"]

"""-------DATA EXTRACTION------"""

def findFiles(path):
    return glob.glob(path)

def unicodeToAscii(s):
    return ''.join(
        c for c in unicodedata.normalize('NFD', s)
        if unicodedata.category(c) != 'Mn'
        and c in all_letters
    )

# Read a file and split into lines
def readLines(filename):
    lines = open(filename, encoding='utf-8').read().strip().split('\n')
    return [unicodeToAscii(line) for line in lines]

for file in findFiles("/home/andrew/PyCharm/PycharmProjects/CantStop/data/names/*.txt"):
    gender = file.split("/")[-1].split(".")[0]
    names = readLines(file)
    all_names[gender] = names

"""-----DATA INTERPRETATION-----"""

def nameToTensor(name):
    tensor = torch.zeros(len(name), 1, num_letters)
    for index, letter in enumerate(name):
        tensor[index][0][all_letters.find(letter)] = 1
    return tensor

def outputToGender(output):
    gender, gender_index = output.data.topk(1)
    if gender_index[0][0] == 0:
        return "Female"
    return "Male"

"""------NETWORK SETUP------"""

class Net(nn.Module):
    def __init__(self, input_size, output_size):
        super(Net, self).__init__()
        #Layer 1
        self.Lin1 = nn.Linear(input_size, int(input_size/2))
        self.ReLu1 = nn.ReLU()
        self.Batch1 = nn.BatchNorm1d(int(input_size/2))
        #Layer 2
        self.Lin2 = nn.Linear(int(input_size/2), output_size)
        self.ReLu2 = nn.ReLU()
        self.Batch2 = nn.BatchNorm1d(output_size)
        self.softMax = nn.LogSoftmax()

    def forward(self, input):
        output1 = self.Batch1(self.ReLu1(self.Lin1(input)))
        output2 = self.softMax(self.Batch2(self.ReLu2(self.Lin2(output1))))
        return output2

NN = Net(num_letters, 2)

"""------TRAINING------"""

def getRandomTrainingEx():
    gender = genders[random.randint(0, 1)]
    name = all_names[gender][random.randint(0, len(all_names[gender])-1)]
    gender_tensor = Variable(torch.LongTensor([genders.index(gender)]))
    name_tensor = Variable(nameToTensor(name))
    return gender_tensor, name_tensor, gender

def train(input, target):
    loss_func = nn.NLLLoss()

    optimizer = optim.SGD(NN.parameters(), lr=0.0001, momentum=0.9)

    optimizer.zero_grad()

    output = NN(input)

    loss = loss_func(output, target)
    loss.backward()
    optimizer.step()

    return output, loss

all_losses = []
current_loss = 0

for i in range(100000):
    gender_tensor, name_tensor, gender = getRandomTrainingEx()
    output, loss = train(name_tensor, gender_tensor)
    current_loss += loss

    if i%1000 == 0:
        print("Guess: %s, Correct: %s, Loss: %s" % (outputToGender(output), gender, loss.data[0]))

    if i%100 == 0:
        all_losses.append(current_loss/10)
        current_loss = 0

# plt.figure()
# plt.plot(all_losses)
# plt.show()

Please help a newbie out!

Answer 1

Debugging the error:

PyCharm is a helpful Python IDE whose debugger lets you set breakpoints and view the dimensions of your tensors.
For easier debugging, do not nest the forward calls like this:

output1 = self.Batch1(self.ReLu1(self.Lin1(input)))

Instead, split them into separate steps:

h1 = self.ReLu1(self.Lin1(input))
h2 = self.Batch1(h1)

As for the stack trace, PyTorch also provides a regular Python stack trace. I believe that before the line

RuntimeError: matrices expected, got 3D, 2D tensors at /py/conda-bld/pytorch_1493681908901/work/torch/lib/TH/generic/THTensorMath.c:1232

there are some Python stack trace lines that point right into your code. For easier debugging, as I said, don't nest the forward calls.

Use PyCharm to set a breakpoint before the crash point. Then, in the debugger's watch window, use Variable(torch.rand(dim1, dim2)) to test the forward pass and check the input and output dimensions of each layer. If a dimension is incorrect, compare it with the dimensions of your actual input, which you can get by calling input.size() in the watch window.

For example, evaluate self.ReLu1(self.Lin1(Variable(torch.rand(10, 20)))).size(). If it shows red text (an error), the input dimension is incorrect. Otherwise, it shows the size of the output.
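The same check can also be run as a standalone script outside the debugger. The following is only a minimal sketch, assuming a recent PyTorch release in which plain tensors can be passed to layers without wrapping them in Variable. The layer sizes mirror the first layer of the network in the question; the batch size of 4 and the example name are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

num_letters = 57  # len(string.ascii_letters + " .,;'"), as in the question

# Same layer sizes as the first layer of Net in the question.
lin1 = nn.Linear(num_letters, num_letters // 2)
relu1 = nn.ReLU()
batch1 = nn.BatchNorm1d(num_letters // 2)

# A 2D input: (batch_size, num_features). The batch size of 4 is arbitrary.
x = torch.rand(4, num_letters)

# One step per line, so each intermediate size can be inspected.
h1 = relu1(lin1(x))
print("after Lin1 + ReLu1:", tuple(h1.size()))
h2 = batch1(h1)
print("after Batch1:", tuple(h2.size()))

# The tensor built by nameToTensor in the question is 3D instead:
# (name_length, 1, num_letters).
name_tensor = torch.zeros(len("Andrew"), 1, num_letters)
print("nameToTensor-style input:", tuple(name_tensor.size()))
```

Printing the size after every step makes it easy to see where a 3D tensor enters a part of the network that was set up for 2D (batch, features) input.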

Read the docs

The PyTorch docs specify the input and output dimensions of each layer. They also include an example code snippet:

>>> rnn = nn.RNN(10, 20, 2)
>>> input = Variable(torch.randn(5, 3, 10))
>>> h0 = Variable(torch.randn(2, 3, 20))
>>> output, hn = rnn(input, h0)

You can run this snippet in the PyCharm debugger to explore the input and output dimensions of the specific layer you are interested in (RNN, Linear, BatchNorm1d).

Answer 2

First, regarding your error: as the other answer says, and as the exception itself indicates, it is probably because your input is not shaped correctly. You could try debugging to isolate the line that raises the error, and then edit your question to include it, so we know for sure what is causing the problem and can correct it (without the full stack trace it is harder to know what the problem is).

Now, you are trying to implement a neural network that classifies names by gender, as you indicated. This task requires taking a name as input (names have different lengths) and outputting a gender (a binary variable: male or female). However, neural networks in general are built and trained to classify inputs (vectors) with a fixed number of features, as mentioned in the PyTorch docs:

Parameters: input_size – The number of expected features in the input x

...

Looking at the tutorial you mentioned, they do consider this situation: in their case the input to the network is a single letter transformed into a "one-hot vector", as they indicate:

To run a step of this network we need to pass an input (in our case, the Tensor for the current letter) and a previous hidden state (which we initialize as zeros at first). We'll get back the output (probability of each language) and a next hidden state (which we keep for the next step).

They even give an example of it (remember that tensors are Variables in PyTorch):

input = Variable(letterToTensor('A'))
hidden = Variable(torch.zeros(1, n_hidden))
output, next_hidden = rnn(input, hidden)
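To make the "one-hot vector" idea concrete, here is a small sketch in plain Python, without PyTorch. The helper name letter_to_one_hot is hypothetical and not part of the tutorial; the alphabet is the all_letters string from the question.

```python
import string

all_letters = string.ascii_letters + " .,;'"  # same alphabet as in the question


def letter_to_one_hot(letter):
    """Return a list with a single 1 at the letter's index in all_letters."""
    vector = [0] * len(all_letters)
    vector[all_letters.index(letter)] = 1  # raises ValueError for unknown characters
    return vector


vec = letter_to_one_hot("A")
print(len(vec), vec.index(1))
```

Every letter maps to a vector of the same length, which is why a single letter is a valid fixed-size input while a whole name is not.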

Note: That being said, there are some other things you can do to adapt your implementation to variable-sized inputs. Based on my experience, and complemented by this and this other great question, you could:

Preprocess your data to extract new features and transform it into fixed-size inputs. This is usually the most used approach, but it requires experience and patience to get good features. Some techniques used are PCA (Principal Component Analysis) and LDA (Latent Dirichlet Allocation).
For example, you could extract features from your data such as: the length of the name, the number of letter a's in the name (female names tend to have more a's), the number of letter e's in the name (the same but with male names, maybe?), and others, so you can generate new feature vectors like [name_length, a_found, e_found, ...]. Then you could follow a regular approach with your new fixed-size vectors. Do note that those features have to be meaningful; the ones above I just came up with as examples (although they could work).
Split your input names into fixed-size substrings (or iterate over them with a sliding window), so that you can classify them with a network designed for that size and combine the outputs in an ensemble way to obtain the final classification.
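Both suggestions can be sketched in a few lines of plain Python. This is only an illustration: the function names name_features and sliding_windows are hypothetical, the three features are the made-up ones from the example above, and padding short names with spaces is an assumption of this sketch rather than something the answer prescribes.

```python
def name_features(name):
    """Map a name of any length to a fixed-size feature vector:
    [name_length, number of a's, number of e's]."""
    lowered = name.lower()
    return [len(name), lowered.count("a"), lowered.count("e")]


def sliding_windows(name, size=3):
    """Return every substring of length `size`.
    Names not longer than `size` yield a single space-padded window (an assumption)."""
    if len(name) <= size:
        return [name.ljust(size)]
    return [name[i:i + size] for i in range(len(name) - size + 1)]


print(name_features("Andrea"))
print(sliding_windows("Andrea"))
```

Either way, every item that reaches the network has the same size: a three-element feature vector in the first case, or substrings of a fixed length (to be encoded, for example, as one-hot vectors) in the second.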
