
PyTorch: incorrect value of member variable when using multi-GPU

Here is a simple class for running in a multi-GPU environment. The member variable self.firstIter should become False after the first iteration.

import torch.nn as nn

class TestNetwork(nn.Module):

    def __init__(self):
        super(TestNetwork, self).__init__()
        self.firstIter = True  # indicates whether it's the first iteration

    def forward(self, input):
        print('is firstIter: ', self.firstIter)  # always True!!
        if self.firstIter:
            self.firstIter = False
        # do other things

The code works as expected when using only one GPU.

However, when using multiple GPUs (i.e. nn.DataParallel), the value of self.firstIter is always printed as True.

Why does this happen? What is wrong with the code?

Using PyTorch version 0.3.1.

Basically, DataParallel operates on replicas of the model, and changes made to a replica during forward are not visible outside the forward/backward call when the number of devices is greater than 1.
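A GPU-free sketch of that behavior, using copy.deepcopy as a stand-in for DataParallel's per-forward replication (the names here are illustrative, not PyTorch internals):

```python
import copy

class TestNetwork:
    """Plain-Python stand-in for the nn.Module above."""
    def __init__(self):
        self.firstIter = True

    def forward(self, x):
        if self.firstIter:
            self.firstIter = False  # only the replica is mutated
        return x

master = TestNetwork()
for _ in range(3):
    # DataParallel replicates the module for every forward call,
    # so each iteration mutates a throwaway copy.
    replica = copy.deepcopy(master)
    replica.forward(None)

print(master.firstIter)  # still True: the mutation never reaches the master
```

Each replica starts from the master's state, so every forward call sees firstIter == True, exactly as observed in the question.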

Please refer to https://discuss.pytorch.org/t/nonetype-attribute-when-using-dataparallel/11566 for details.
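One workaround, again sketched in plain Python, is to keep the mutation outside forward, where it runs on the real module rather than a replica (with an actual nn.DataParallel wrapper you would set the flag via model.module.firstIter, since DataParallel stores the wrapped module in its .module attribute):

```python
class TestNetwork:
    def __init__(self):
        self.firstIter = True

    def forward(self, x):
        # forward only reads the flag; it never writes it
        if self.firstIter:
            pass  # first-iteration-only setup would go here
        return x

model = TestNetwork()
for step in range(3):
    out = model.forward(step)
    if model.firstIter:
        model.firstIter = False  # flip the flag on the master, outside forward

print(model.firstIter)  # False after the first step
```

Because the training loop holds a reference to the master module, the write persists across iterations regardless of how many replicas forward runs on.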
