
In a Python multiprocessing Manager namespace, why can't I assign directly?

Can anyone help me figure out why we can't change the DataFrame directly? add_new_derived_column_NOT_work doesn't work as I expected.

# -*- coding: UTF-8 -*-
import pandas as pd
import numpy as np
# Import only the names the script uses instead of a wildcard import
from multiprocessing import Manager, Process

def add_new_derived_column_work(ns):
    dataframe2 = ns.df
    dataframe2['new_column']=dataframe2['A']+dataframe2['B'] / 2
    print (dataframe2.head())
    ns.df = dataframe2

def add_new_derived_column_NOT_work(ns):
    ns.df['new_column']=ns.df['A']+ns.df['B'] / 2
    print (ns.df.head())

if __name__ == "__main__":

    mgr = Manager()
    ns = mgr.Namespace()

    dataframe = pd.DataFrame(np.random.randn(100000, 2), columns=['A', 'B'])
    ns.df = dataframe
    print (dataframe.head())

    # then pass the shared namespace (ns) to a multiprocessing.Process object
    process=Process(target=add_new_derived_column_work, args=(ns,))
    process.start()
    process.join()

    print (ns.df.head())

As per the Python documentation:

  • a namespace object has "writable attributes"

  • but about proxy objects, you can read that "If standard (non-proxy) list or dict objects are contained in a referent, modifications to those mutable values will not be propagated through the manager because the proxy has no way of knowing when the values contained within are modified. However, storing a value in a container proxy does propagate through the manager and so to effectively modify such an item"; in other words, the modified value has to be re-assigned to the proxy.

This is why, I think, add_new_derived_column_work works: it assigns a new DataFrame to ns.df. add_new_derived_column_NOT_work fails because it tries to add a column by mutating the DataFrame in place. Each read of ns.df returns a copy of the object held by the manager process, so the mutation is applied to a temporary copy and does not actually affect ns.df.
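The difference can be reproduced without pandas. The following is a minimal sketch, not the asker's code: it assumes a plain dict stands in for the DataFrame, and the names demo, data and local are made up for illustration. It only shows that an in-place change to ns.data is lost, while read, modify, re-assign is seen by the manager.

```python
from multiprocessing import Manager


def demo():
    """Return snapshots of ns.data after an in-place change and after re-assignment."""
    with Manager() as mgr:
        ns = mgr.Namespace()
        ns.data = {"A": 1, "B": 2}  # hypothetical stand-in for the DataFrame

        # Each read of ns.data returns a fresh copy from the manager process,
        # so this item assignment only changes a temporary local copy.
        ns.data["new_column"] = 3
        after_in_place = ns.data

        # Read a copy, modify it, then assign it back: the attribute
        # assignment goes through the proxy and reaches the manager.
        local = ns.data
        local["new_column"] = 3
        ns.data = local
        after_reassign = ns.data

    return after_in_place, after_reassign


if __name__ == "__main__":
    in_place, reassigned = demo()
    print("new_column" in in_place)    # False
    print("new_column" in reassigned)  # True
```

The same pattern applies to the DataFrame in the question: take a local reference, add the column, then assign the result back to ns.df, which is what add_new_derived_column_work does.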
