
How are "data" and "target" chosen in federated learning? (PySyft)

I can't understand how, in the function train() below, the variables (data, target) are chosen.

def train(args, model, device, federated_train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(federated_train_loader): # <-- now it is a distributed dataset
        model.send(data.location) # <-- NEW: send the model to the right location

I guess they are two tensors representing two random images from the training dataset. But then, is the loss function

loss = F.nll_loss(output, target)

calculated at every iteration with a different target?

I also have a different question: I trained the network with images of cats, then tested it with images of cars, and the accuracy reached is 97%. How is this possible? Is that a proper value, or am I doing something wrong?

Here is the entire code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

import syft as sy  # <-- NEW: import the Pysyft library
hook = sy.TorchHook(torch)  # <-- NEW: hook PyTorch ie add extra functionalities to support Federated Learning
bob = sy.VirtualWorker(hook, id="bob")  # <-- NEW: define remote worker bob
alice = sy.VirtualWorker(hook, id="alice")  # <-- NEW: and alice

class Arguments():
    def __init__(self):
        self.batch_size = 64
        self.test_batch_size = 1000
        self.epochs = 2
        self.lr = 0.01
        self.momentum = 0.5
        self.no_cuda = False
        self.seed = 1
        self.log_interval = 30
        self.save_model = False

args = Arguments()

use_cuda = not args.no_cuda and torch.cuda.is_available()

torch.manual_seed(args.seed)

device = torch.device("cuda" if use_cuda else "cpu")

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}

federated_train_loader = sy.FederatedDataLoader( # <-- this is now a FederatedDataLoader
    datasets.MNIST("C:\\users...\\train", train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ]))
    .federate((bob, alice)), # <-- NEW: we distribute the dataset across all the workers, it's now a FederatedDataset
    batch_size=args.batch_size, shuffle=True, **kwargs)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST("C:\\Users...\\test", train=False, download=True, transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=args.test_batch_size, shuffle=True, **kwargs)


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

def train(args, model, device, federated_train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(federated_train_loader): # <-- now it is a distributed dataset
        model.send(data.location) # <-- NEW: send the model to the right location
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        model.get() # <-- NEW: get the model back
        if batch_idx % args.log_interval == 0:
            loss = loss.get() # <-- NEW: get the loss back
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * args.batch_size, len(federated_train_loader) * args.batch_size,
                100. * batch_idx / len(federated_train_loader), loss.item()))

def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item() # sum up batch loss
            pred = output.argmax(1, keepdim=True) # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))


model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=args.lr) # TODO momentum is not supported at the moment

for epoch in range(1, args.epochs + 1):
    train(args, model, device, federated_train_loader, optimizer, epoch)
    test(args, model, device, test_loader)

if (args.save_model):
    torch.save(model.state_dict(), "mnist_cnn.pt")   

Consider it like this. When you hook torch, all your torch tensors get additional functionality: methods like .send() and .federate(), and attributes like .location and ._objects. Your data and target, which were once torch tensors, became pointers to tensors residing in different VirtualWorker objects, thanks to .federate((bob, alice)).
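As a minimal sketch of what hooking and sending look like (assuming the same PySyft 0.2.x API that the question's code uses; the tensor x and the variable x_ptr are hypothetical names for illustration):

import torch
import syft as sy

hook = sy.TorchHook(torch)              # adds .send()/.get()/.location etc. to torch tensors
bob = sy.VirtualWorker(hook, id="bob")  # a simulated remote worker

x = torch.tensor([1., 2., 3.])  # an ordinary local tensor
x_ptr = x.send(bob)             # the data now lives on bob; we keep only a pointer

print(x_ptr.location)  # the worker holding the actual data, i.e. bob
print(bob._objects)    # bob's object store now contains the real tensor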

Now data and target have additional attributes, including .location, which returns the location of the tensor that the pointer named data/target points to.

Federated learning sends the global model to this location, as seen in model.send(data.location).

Now model is a pointer residing at that same location, and data is also a pointer residing there. Hence when you take the output as output = model(data), the output will also reside there, and all we (the central server, or in other words, the VirtualWorker called 'me') get is a pointer to that output.

Now, regarding your doubt about the loss calculation: since output and target both reside in that same location, the calculation of loss also happens there. The same goes for backprop and the optimizer step.
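Putting this together, one iteration of the train() loop above conceptually does the following (a sketch using the names from the question's code; data and target are assumed to be pointers to a batch living on bob, as produced by the FederatedDataLoader):

model.send(data.location)          # push the global model to bob
output = model(data)               # forward pass runs on bob; output is a pointer
loss = F.nll_loss(output, target)  # loss is computed on bob; loss is a pointer too
loss.backward()                    # backprop also happens remotely
optimizer.step()                   # the weight update is applied to bob's copy
model.get()                        # pull the updated model back to 'me'
print(loss.get().item())           # pull the scalar loss back to inspect it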

Finally, you can see model.get(); this is where the central server pulls the remote model back using the pointer named model. (I'm not sure whether it should be model = model.get(), though.)

So anything with .get() will be pulled from that worker and returned by our Python statement. Also note that .get() removes the object from its location when called. Hence use .copy().get() if you are going to need it again.
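Here is a sketch of the difference (again assuming the 0.2.x API; y, z, and w are hypothetical names for illustration):

y = torch.tensor([4., 5.]).send(bob)

z = y.copy().get()  # bob keeps his tensor; we receive a local copy
w = y.get()         # pulls the tensor back AND removes it from bob's store,
                    # so the pointer y is no longer usable afterwards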
