
What is the PyTorch equivalent of a TensorFlow linear regression?

I am learning PyTorch, and I want to do a basic linear regression on data created this way:

from sklearn.datasets import make_regression
import matplotlib.pyplot as plt

x, y = make_regression(n_samples=100, n_features=1, noise=15, random_state=42)
y = y.reshape(-1, 1)
print(x.shape, y.shape)

plt.scatter(x, y)

I know that this TensorFlow code can solve it:

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=1, activation='linear', input_shape=(x.shape[1], )))

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.05), loss='mse')

hist = model.fit(x, y, epochs=15, verbose=0)

But I need to know what the PyTorch equivalent would look like. What I tried was this:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Model class
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        x = self.linear(x)
        return x

    def predict(self, x):
        return self.forward(x)

model = Net()

loss_fn = F.mse_loss
opt = torch.optim.SGD(model.parameters(), lr=0.05)

# Function to train
def fit(num_epochs, model, loss_fn, opt, train_dl):
    # Repeat for given number of epochs
    for epoch in range(num_epochs):
        
        # Train with batches of data
        for xb, yb in train_dl:
            
            # 1. Generate predictions
            pred = model(xb)
            
            # 2. Calculate Loss
            loss = loss_fn(pred, yb)
            
            # 3. Compute gradients
            loss.backward()
            
            # 4. Update parameters using gradients
            opt.step()
            
            # 5. Reset the gradients to zero
            opt.zero_grad()
            
        # Print the progress
        if (epoch+1) % 10 == 0:
            print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, loss.item()))

# Training
fit(200, model, loss_fn, opt, data_loader)

But the model doesn't learn anything, and I don't know what else to do.

The input/output dimensions are 1/1.

Dataset

First of all, you should define a torch.utils.data.Dataset:

import torch
from sklearn.datasets import make_regression


class RegressionDataset(torch.utils.data.Dataset):
    def __init__(self):
        data = make_regression(n_samples=100, n_features=1, noise=0.1, random_state=42)
        self.x = torch.from_numpy(data[0]).float()
        self.y = torch.from_numpy(data[1]).float()

    def __len__(self):
        return len(self.x)

    def __getitem__(self, index):
        return self.x[index], self.y[index]

It converts the numpy data to PyTorch tensors inside __init__ and casts the data to float (numpy defaults to double, while PyTorch's default is float, which uses less memory).

Apart from that, it simply returns a tuple of features and the respective regression targets.
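For instance, here is a quick sketch (assuming the RegressionDataset class above) that shows the dtype conversion and what __getitem__ returns:

import numpy as np
import torch

arr = np.zeros(3)                            # numpy defaults to float64 (double)
print(torch.from_numpy(arr).dtype)           # torch.float64
print(torch.from_numpy(arr).float().dtype)   # torch.float32, PyTorch's default

dataset = RegressionDataset()
x0, y0 = dataset[0]        # __getitem__ returns a (feature, target) tuple
print(x0.shape, y0.shape)  # torch.Size([1]) torch.Size([])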

Fit

Almost there, but you have to flatten the output from the model (described below). torch.nn.Linear returns tensors of shape (batch, 1), while your targets are of shape (batch,). flatten() removes the unnecessary 1 dimension:

# 2. Calculate Loss
loss = criterion(pred.flatten(), yb)
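To see why this matters, here is a small sketch of the shapes involved (the tensor names are illustrative):

import torch

model = torch.nn.Linear(1, 1)
xb = torch.randn(32, 1)  # a batch of 32 single-feature samples
yb = torch.randn(32)     # targets of shape (batch,)

pred = model(xb)
print(pred.shape)            # torch.Size([32, 1])
print(pred.flatten().shape)  # torch.Size([32]), now matches yb

# Without flatten(), MSELoss broadcasts (32, 1) against (32,) into
# (32, 32), silently computing the wrong loss (PyTorch warns about this).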

Model

That is all you actually need:

model = torch.nn.Linear(1, 1)

Any layer can be called directly; there is no need for a forward method and inheritance for simple models.
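For example, calling the layer on a batch works out of the box (a quick sanity check, not part of the training code):

import torch

layer = torch.nn.Linear(1, 1)
out = layer(torch.randn(4, 1))  # call the layer directly
print(out.shape)                # torch.Size([4, 1])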

Calling

The rest is almost okay; you just have to create a torch.utils.data.DataLoader and pass it an instance of our dataset. What DataLoader does is issue __getitem__ calls to the dataset multiple times and collect the results into a batch of the specified size (there is some other funny business going on, but that's the idea):

dataset = RegressionDataset()
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
model = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=3e-4)

fit(5000, model, criterion, optimizer, dataloader)
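If you want to see what DataLoader actually yields, you can pull one batch manually (a quick sanity check, assuming the objects defined above):

xb, yb = next(iter(dataloader))  # one batch from the loader
print(xb.shape, yb.shape)        # torch.Size([32, 1]) torch.Size([32])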

Also notice I've used torch.nn.MSELoss(); since we are passing an object around, it looks better than a bare function in this case.
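Both forms compute the same value; this sketch compares the object and the functional API on a toy tensor:

import torch
import torch.nn.functional as F

pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.5, 2.0, 2.5])

criterion = torch.nn.MSELoss()
print(criterion(pred, target))   # tensor(0.1667)
print(F.mse_loss(pred, target))  # same value, functional form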

Whole code

To make it easier:

import torch
from sklearn.datasets import make_regression


class RegressionDataset(torch.utils.data.Dataset):
    def __init__(self):
        data = make_regression(n_samples=100, n_features=1, noise=0.1, random_state=42)
        self.x = torch.from_numpy(data[0]).float()
        self.y = torch.from_numpy(data[1]).float()

    def __len__(self):
        return len(self.x)

    def __getitem__(self, index):
        return self.x[index], self.y[index]


# Function to train
def fit(num_epochs, model, criterion, optimizer, train_dl):
    # Repeat for given number of epochs
    for epoch in range(num_epochs):

        # Train with batches of data
        for xb, yb in train_dl:

            # 1. Generate predictions
            pred = model(xb)

            # 2. Calculate Loss
            loss = criterion(pred.flatten(), yb)

            # 3. Compute gradients
            loss.backward()

            # 4. Update parameters using gradients
            optimizer.step()

            # 5. Reset the gradients to zero
            optimizer.zero_grad()

        # Print the progress
        if (epoch + 1) % 10 == 0:
            print(
                "Epoch [{}/{}], Loss: {:.4f}".format(epoch + 1, num_epochs, loss.item())
            )


dataset = RegressionDataset()
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
model = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=3e-4)

fit(5000, model, criterion, optimizer, dataloader)

You should get a loss of around 0.053; vary noise or other parameters for a harder/easier regression task.
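As a final sanity check (a sketch, assuming the training above has run), you can inspect the learned parameters; for a well-fit model the weight should be close to the coefficient make_regression used to generate the data:

print(model.weight.item(), model.bias.item())  # learned slope and intercept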
