"1D target tensor expected" error when using the PyTorch TensorDataset class

Question

I am wondering why this error is occurring. My hunch is that TensorDataset reads the last column as the labels, but I don't see why it would behave that way when I pass a separate dataset for labels as the second argument. Also, can someone explain exactly how one-hot encoding works, and how I can fix this problem given that I only want one label per item?

Error: return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: 1D target tensor expected, multi-target not supported

Code:

import pandas as pd
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

if __name__ == '__main__':

    inputs_file = pd.read_csv('dataset.csv')
    targets_file = pd.read_csv('labels.csv')

    inputs = inputs_file.iloc[1:1001].values
    targets = targets_file.iloc[1:1001].values

    inputs = torch.tensor(inputs, dtype=torch.float32)
    targets = torch.tensor(targets)

    dataset = TensorDataset(inputs, targets)

    val_size = 200
    test_size = 100
    train_size = len(dataset) - (val_size + test_size)

    # Divide dataset into 3 unique random subsets
    training_data, validation_data, test_data = random_split(dataset, [train_size, val_size, test_size])

    batch_size = 50

    train_loader = DataLoader(training_data, batch_size, shuffle=True, num_workers=4, pin_memory=True)
    valid_loader = DataLoader(validation_data, batch_size*2, num_workers=4, pin_memory=True)

Answer 1

From what I gather from the discussion in the comments, the error is reproduced by the following.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

inputs = torch.randn(999, 11, dtype=torch.float32)
targets = torch.randint(5, (999, 1), dtype=torch.long)

# you need this to adapt from pandas, but not for this example code
# inputs = torch.tensor(inputs, dtype=torch.float32)
# targets = torch.tensor(targets)

dataset = TensorDataset(inputs, targets)

val_size = 200
test_size = 100
train_size = len(dataset) - (val_size + test_size)

# Divide dataset into 3 unique random subsets
training_data, validation_data, test_data = random_split(dataset, [train_size, val_size, test_size])

batch_size = 50

train_loader = DataLoader(training_data, batch_size, shuffle=True, num_workers=4, pin_memory=True)
valid_loader = DataLoader(validation_data, batch_size*2, num_workers=4, pin_memory=True)

# guess model. More on this in a moment
model = nn.Sequential(
    nn.Linear(11, 8),
    nn.Linear(8, 5),
)

loss_func = nn.CrossEntropyLoss()

for features, labels in train_loader:
    out = model(features)
    loss = loss_func(out, labels)
    print(f"{loss = }")
    break

Solution 1

Add labels.squeeze(-1) to the loop body, like so:

for features, labels in train_loader:
    out = model(features)
    labels = labels.squeeze(-1)  # (batch, 1) -> (batch,)
    loss = loss_func(out, labels)
    print(f"{loss = }")
    break
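
As a minimal sketch of what the squeeze does, the snippet below uses random stand-in tensors (the batch of 50 samples and 5 classes mirrors the reproduction above; the variable names are illustrative). It also shows why squeeze(-1) is the safer form: with a single-sample batch, a bare squeeze() would drop the batch axis as well.

```python
import torch
from torch import nn

torch.manual_seed(0)
loss_func = nn.CrossEntropyLoss()

# Stand-ins for one batch: 50 samples, 5 classes, labels stored as a column.
out = torch.randn(50, 5)
labels = torch.randint(5, (50, 1), dtype=torch.long)

squeezed = labels.squeeze(-1)      # (50, 1) -> (50,)
loss = loss_func(out, squeezed)    # accepted: targets are now 1D
print(squeezed.shape, loss.ndim)

# With a single-sample batch, squeeze() without an argument removes
# every size-1 axis, including the batch axis.
single = torch.randint(5, (1, 1), dtype=torch.long)
print(single.squeeze(-1).shape)    # torch.Size([1])
print(single.squeeze().shape)      # torch.Size([])
```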

Solution 2

Flatten your targets up front with

targets = torch.tensor(targets[:, 0])
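
To illustrate, here is a small sketch that uses a NumPy array as a stand-in for targets_file.iloc[1:1001].values. It assumes labels.csv has a single column, so that .values yields a 2-D array of shape (N, 1); the array contents and names are made up for the example.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset

# Hypothetical stand-in for the single-column labels array, shape (N, 1).
targets_np = np.random.randint(0, 5, size=(999, 1), dtype=np.int64)
inputs = torch.randn(999, 11)

column_targets = torch.tensor(targets_np)                          # shape (999, 1)
flat_targets = torch.tensor(targets_np[:, 0], dtype=torch.long)    # shape (999,)

print(column_targets.shape, flat_targets.shape)

# Each item of the dataset now carries a scalar label, so a batch of
# labels comes out of the DataLoader with shape (batch,).
dataset = TensorDataset(inputs, flat_targets)
features, label = dataset[0]
print(features.shape, label.shape)
```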

In response to

Now I am getting this error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (11x1 and 11x8). I should also add that I am using a hidden layer of size 8 and I have 5 classes.

My architecture is a guess at what you're using, but since the code above is fixed by reshaping the targets, I'll need more information to be more helpful.
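
One reading of the shapes in that message, offered only as an assumption since the actual model code isn't shown: the first layer received a tensor whose last dimension is 1 instead of 11, for example a single sample laid out as a column. The sketch below reproduces that kind of mismatch with a hypothetical nn.Linear(11, 8) layer and shows the layout it expects.

```python
import torch
from torch import nn

layer = nn.Linear(11, 8)   # expects inputs whose last dimension is 11

bad = torch.randn(11, 1)    # assumed faulty layout: last dimension is 1
good = torch.randn(50, 11)  # batch of 50 samples, 11 features each

try:
    layer(bad)
    failed = False
except RuntimeError as err:
    failed = True
    print("shape mismatch:", err)

print(layer(good).shape)    # torch.Size([50, 8])
print(layer(bad.T).shape)   # torch.Size([1, 8]) once the sample is a row
```

If this matches your situation, printing features.shape just before the model call should show where the layout goes wrong.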

Perhaps some documentation will help? For targets given as class indices, the example code in the CrossEntropyLoss documentation shows the expected shape of the targets as (N), rather than (N, 1) or (N, classes).
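
On the one-hot part of the question, a rough sketch may help: a one-hot vector for class k out of C classes has length C, with a 1 at position k and 0 elsewhere. When you want one label per item, CrossEntropyLoss can take the class index directly, so no one-hot step is needed. The tensors below are made-up examples, and the argmax conversion is just one way to get back to indices if your labels happen to be stored one-hot.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)
num_classes = 5
logits = torch.randn(4, num_classes)     # model output, shape (N, C)
class_idx = torch.tensor([2, 0, 4, 1])   # one label per item, shape (N,)

# One-hot form: row i has a single 1 in column class_idx[i].
one_hot = F.one_hot(class_idx, num_classes=num_classes)  # shape (N, C)
print(one_hot)

# argmax along the class axis recovers the index form.
recovered = one_hot.argmax(dim=1)

loss = nn.CrossEntropyLoss()(logits, recovered)
print(recovered, loss.item())
```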

Answer 1
0 · Accepted · 2021-10-11 19:52:01
