
Python: Dynamically add properties to class instance, properties return function value with inputs

I've been going through all the Stack Overflow answers on dynamic property setting, but for whatever reason I can't seem to get this to work.

I have a class, Evolution_Base, that in its __init__ creates an instance of Value_Differences. Value_Differences should dynamically create properties, based on the list I pass, that return the function value from _get_df_change:

from pandas import DataFrame
from dataclasses import dataclass
import pandas as pd
class Evolution_Base():
    
    def __init__(self, res_date_0 : DataFrame , res_date_1 : DataFrame):
        
        @dataclass
        class Results_Data():          
            res_date_0_df : DataFrame               
            res_date_1_df : DataFrame
            
    
        self.res = Results_Data(res_date_0_df= res_date_0,
                                res_date_1_df= res_date_1)
        
        property_list = ['abc', 'xyz']
        self.difference = Value_Differences(parent = self, property_list=property_list)
        
    
    # Shared Functions
    def _get_df_change(self, df_name, operator = '-'):
        df_0 = getattr(self.res.res_date_0_df, df_name.lower())
        df_1 = getattr(self.res.res_date_1_df, df_name.lower())
        return self._df_change(df_1, df_0, operator=operator)
        
    def _df_change(self, df_1 : pd.DataFrame, df_0 : pd.DataFrame, operator = '-') -> pd.DataFrame:
        """
        Returns df_1 <operator | default = -> df_0
        """        
        # is_numeric mask
        m_1 = df_1.select_dtypes('number')
        m_0 = df_0.select_dtypes('number')
        
        def label_me(x):
            x.columns = ['t_1', 't_0']
            return x
        
        if operator == '-':
            return label_me(df_1[m_1] - df_0[m_0])
        elif operator == '+':
            return label_me(df_1[m_1] + df_0[m_0])
        
        
class Value_Differences():    
    def __init__(self, parent : Evolution_Base, property_list = []):
        self._parent = parent
    
        for name in property_list:
                        
            def func(self, prop_name):
                return self._parent._get_df_change(name)
            
            # I've tried the following... 
            setattr(self, name, property(fget = lambda cls_self: func(cls_self, name)))
            setattr(self, name, property(func(self, name)))
            setattr(self, name, property(func))

It's driving me nuts... Any help appreciated!

My desired outcome is for:

evolution = Evolution_Base(df_1, df_2)
evolution.difference.abc == evolution._get_df_change('abc')
evolution.difference.xyz == evolution._get_df_change('xyz')

EDIT: The simple question is really, how do I setattr for a property function?

As asked

how do I setattr for a property function?

To be usable as a property, the accessor function needs to be wrapped as a property and then assigned as an attribute of the class, not the instance.
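A minimal sketch of that difference, using a made-up class Demo that is not part of the question: a property stored on an instance is handed back as a plain object, while the same property stored on the class is invoked on attribute access.

```python
class Demo:  # hypothetical class, for illustration only
    pass

d = Demo()

# Stored on the instance: the descriptor protocol is not triggered,
# so attribute access returns the property object itself.
setattr(d, 'on_instance', property(lambda self: 42))
print(type(d.on_instance).__name__)  # property

# Stored on the class: attribute access on an instance calls the getter.
setattr(Demo, 'on_class', property(lambda self: 42))
print(d.on_class)  # 42
```

This is why all three setattr(self, ...) attempts in the question leave a property object sitting on the instance rather than a computed value.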

That function, meanwhile, needs to have a single unbound parameter, which will be an instance of the class, but is not necessarily the current self. Its logic needs to use the current value of name, but late binding will be an issue because the lambdas are created in a loop.
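The late-binding issue can be seen in isolation with a short sketch. The list names and the helper show are illustrative only; the point is that a lambda looks up name when it is called, whereas functools.partial or a default argument fixes the value when the callable is created.

```python
from functools import partial

names = ['abc', 'xyz']

# Late binding: each lambda looks up `name` only when called,
# by which time the loop has finished and `name` is 'xyz'.
late = [lambda: name for name in names]
print([f() for f in late])  # ['xyz', 'xyz']

def show(name):  # hypothetical helper
    return name

# Eager binding with functools.partial: the value is stored at creation time.
eager = [partial(show, name) for name in names]
print([f() for f in eager])  # ['abc', 'xyz']

# Eager binding with a default argument: evaluated when the lambda is defined.
defaults = [lambda name=name: name for name in names]
print([f() for f in defaults])  # ['abc', 'xyz']
```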

A clear and simple way to work around this is to define a helper function accepting the Value_Differences instance and the name to use, and then bind the name value eagerly.

Naively:

from functools import partial

def _get_from_parent(name, instance):
    return instance._parent._get_df_change(name)

class Value_Differences:    
    def __init__(self, parent: Evolution_Base, property_list = []):
        self._parent = parent
    
        for name in property_list:            
            setattr(Value_Differences, name, property(
                fget = partial(_get_from_parent, name)
            ))

However, this of course has the issue that every instance of Value_Differences will set properties on the class, thus modifying what properties are available for every other instance. Further, in the case where there are many instances that should have the same properties, the setup work will be repeated at each instance creation.
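A small sketch of that leak, using a stand-in class named Leaky and a string in place of the real parent (both are assumptions for the demo, not the question's classes):

```python
from functools import partial

def _get(name, instance):
    # Hypothetical accessor: reports which parent the value came from.
    return f'{name} from parent {instance._parent}'

class Leaky:  # stand-in for the naive Value_Differences
    def __init__(self, parent, property_list=()):
        self._parent = parent
        for name in property_list:
            # Mutates the class, so every instance is affected.
            setattr(Leaky, name, property(fget=partial(_get, name)))

first = Leaky('A', ['abc'])
print(hasattr(first, 'xyz'))  # False

second = Leaky('B', ['xyz'])
print(hasattr(first, 'xyz'))  # True: creating `second` changed `first`
print(first.xyz)              # xyz from parent A
```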


The apparent goal

It seems that what is really sought, is the ability to create classes dynamically , such that a list of property names is provided and a corresponding class pops into existence, with code filled in for the properties implementing a certain logic.

There are multiple approaches to this.

Factory A: Adding properties to an instantiated template

Just as functions can be nested within each other and the inner function is an object that can be modified and returned (as is common when creating a decorator), a class body can appear within a function, and a new class object (with the same name) is created every time the function runs. (The code in the OP already does this, for the Results_Data dataclass.)

def example():
    class Template:
        pass
    return Template

>>> TemplateA, TemplateB = example(), example()
>>> TemplateA is TemplateB
False
>>> isinstance(TemplateA(), TemplateB)
False
>>> isinstance(TemplateB(), TemplateA)
False

So, a "factory" for value-difference classes could look like

from functools import partial

def _make_value_comparer(property_names, access_func):
    class ValueDifferences:
        def __init__(self, parent):
            self._parent = parent
    for name in property_names:
        setattr(ValueDifferences, name, property(
            fget = partial(access_func, name)
        ))
    return ValueDifferences

Notice that instead of hard-coding a helper, this factory expects to be provided with a function that implements the access logic. That function takes two parameters: a property name, and the ValueDifferences instance. (They're in that order because it's more convenient for functools.partial usage.)
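As a rough usage sketch of such a factory: FakeParent and access below are hypothetical stand-ins (not from the question) that supply a _get_df_change method and the access logic. Each call to the factory yields an independent class, so property sets do not leak between them.

```python
from functools import partial

def _make_value_comparer(property_names, access_func):
    class ValueDifferences:
        def __init__(self, parent):
            self._parent = parent
    for name in property_names:
        setattr(ValueDifferences, name, property(
            fget=partial(access_func, name)
        ))
    return ValueDifferences

class FakeParent:  # hypothetical stand-in for Evolution_Base
    def _get_df_change(self, name):
        return f'change of {name}'

def access(name, instance):  # hypothetical access function
    return instance._parent._get_df_change(name)

AbcXyz = _make_value_comparer(['abc', 'xyz'], access)
OnlyAbc = _make_value_comparer(['abc'], access)

diff = AbcXyz(FakeParent())
print(diff.abc)                               # change of abc
print(diff.xyz)                               # change of xyz
print(hasattr(OnlyAbc(FakeParent()), 'xyz'))  # False
```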

Factory B: Using the type constructor directly

The built-in type in Python has two entirely separate functions.

With one argument, it discloses the type of an object. With three arguments, it creates a new type. The class syntax is in fact syntactic sugar for a call to this builtin. The arguments are:

  • a string name (will be set as the __name__ attribute)
  • a tuple of classes to use as superclasses (will be set as __bases__)
  • a dict mapping attribute names to their values (including methods and properties; will become the __dict__, roughly)
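To make the correspondence concrete, here is a small sketch comparing a class statement with a roughly equivalent three-argument call. The names PointA, PointB and _describe are invented for the example; note that the superclasses are passed as a tuple, and an empty tuple means the class inherits only from object.

```python
def _describe(self):  # hypothetical method shared by both classes
    return f'{type(self).__name__} with {self.dims} dims'

# Ordinary class statement.
class PointA:
    dims = 2
    describe = _describe

# Roughly equivalent call: name, tuple of bases, namespace dict.
PointB = type('PointB', (), {'dims': 2, 'describe': _describe})

print(type(PointB) is type)   # True (one-argument form reports the type)
print(PointB.__name__)        # PointB
print(PointB.__bases__)       # (<class 'object'>,)
print(PointA().describe())    # PointA with 2 dims
print(PointB().describe())    # PointB with 2 dims
```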

In this style, the same factory could look something like:

from functools import partial

def _make_value_comparer(property_names, access_func):
    methods = {
        name: property(fget = partial(access_func, name))
        for name in property_names
    }
    methods['__init__'] = lambda self, parent: setattr(self, '_parent', parent)
    # The bases argument must be a tuple; an empty one means "just object".
    return type('ValueDifferences', (), methods)

Using the factory

In either of the above cases, Evolution_Base would be modified in the same way.

Presumably, every Evolution_Base should use the same ValueDifferences class (i.e., the one that specifically defines abc and xyz properties), so the Evolution_Base class can cache that class as a class attribute, and use it later:

class Evolution_Base():
    def _get_from_parent(name, mvd):
        # mvd._parent will be an instance of Evolution_Base.
        return mvd._parent._get_df_change(name)

    _MyValueDifferences = _make_value_comparer(['abc', 'xyz'], _get_from_parent)

    def __init__(self, res_date_0 : DataFrame , res_date_1 : DataFrame):        
        @dataclass
        class Results_Data():          
            res_date_0_df : DataFrame               
            res_date_1_df : DataFrame
    
        self.res = Results_Data(res_date_0_df= res_date_0,
                                res_date_1_df= res_date_1)
        
        self.difference = self._MyValueDifferences(parent = self)

Notice that the cached _MyValueDifferences class no longer requires a list of property names to be constructed. That's because it was already provided when the class was created. The only thing that varies per instance of _MyValueDifferences is the parent, so that's all that gets passed.
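For a version that runs without pandas, the wiring might be sketched as follows. This is a simplified stand-in, not the original class: the DataFrame arguments are replaced by a hypothetical label string, and _get_df_change returns a string instead of computing a difference.

```python
from functools import partial

def _make_value_comparer(property_names, access_func):
    namespace = {
        name: property(fget=partial(access_func, name))
        for name in property_names
    }
    namespace['__init__'] = lambda self, parent: setattr(self, '_parent', parent)
    return type('ValueDifferences', (), namespace)

class Evolution_Base:
    def _get_from_parent(name, mvd):
        # mvd._parent will be an instance of Evolution_Base.
        return mvd._parent._get_df_change(name)

    # Built once, when the class body runs, and shared by all instances.
    _MyValueDifferences = _make_value_comparer(['abc', 'xyz'], _get_from_parent)

    def __init__(self, label):  # `label` is a placeholder for the real data
        self.label = label
        # The cached class is a class attribute, so it is reached via self.
        self.difference = self._MyValueDifferences(parent=self)

    def _get_df_change(self, df_name):
        return f'{self.label}: change of {df_name}'

first, second = Evolution_Base('first'), Evolution_Base('second')
print(first.difference.abc)    # first: change of abc
print(second.difference.xyz)   # second: change of xyz
print(type(first.difference) is type(second.difference))  # True
```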


Simpler approaches

It seems that the goal is to have a class whose instances are tightly associated with instances of Evolution_Base , providing properties specifically named abc and xyz that are computed using the Evolution_Base 's data.

That could just be hard-coded as a nested class:

class Evolution_Base:
    class EBValueDifferences:
        def __init__(self, parent):
            self._parent = parent

        @property
        def abc(self):
            return self._parent._get_df_change('abc')

        @property
        def xyz(self):
            return self._parent._get_df_change('xyz')

    def __init__(self, res_date_0 : DataFrame , res_date_1 : DataFrame):        
        @dataclass
        class Results_Data():          
            res_date_0_df : DataFrame               
            res_date_1_df : DataFrame
        self.res = Results_Data(res_date_0_df = res_date_0,
                                res_date_1_df = res_date_1)
        self.difference = self.EBValueDifferences(self)

    # _get_df_change etc. as before

Even simpler, provide corresponding properties directly on Evolution_Base :

class Evolution_Base:
    @property
    def abc_difference(self):
        return self._get_df_change('abc')

    @property
    def xyz_difference(self):
        return self._get_df_change('xyz')

    def __init__(self, res_date_0 : DataFrame , res_date_1 : DataFrame):        
        @dataclass
        class Results_Data():          
            res_date_0_df : DataFrame               
            res_date_1_df : DataFrame
        self.res = Results_Data(res_date_0_df = res_date_0,
                                res_date_1_df = res_date_1)

    # _get_df_change etc. as before

# client code now calls my_evolution_base.abc_difference
# instead of my_evolution_base.difference.abc

If there are a lot of such properties, they could be attached using a much simpler dynamic approach (that would still be reusable for other classes that define a _get_df_change ):

def add_df_change_property(name, cls):
    setattr(
        cls, f'{name}_difference',
        property(fget = lambda instance: instance._get_df_change(name))
    )
    # Return the class so the function also works as a class decorator.
    return cls

which can also be adapted for use as a decorator:

from functools import partial

def exposes_df_change(name):
    return partial(add_df_change_property, name)

@exposes_df_change('abc')
@exposes_df_change('xyz')
class Evolution_Base:
    # `self.difference` can be removed, no other changes needed
    ...
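A self-contained sketch of the decorator form, with a stub _get_df_change that returns a string (an assumption made so the example runs without pandas). For this to work, the helper has to return the class; otherwise the decorated name would end up bound to None.

```python
from functools import partial

def add_df_change_property(name, cls):
    setattr(
        cls, f'{name}_difference',
        property(fget=lambda instance: instance._get_df_change(name))
    )
    return cls  # a class decorator must hand the class back

def exposes_df_change(name):
    return partial(add_df_change_property, name)

@exposes_df_change('abc')
@exposes_df_change('xyz')
class Evolution_Base:
    def _get_df_change(self, df_name):
        # Stub for demonstration; the real method compares DataFrames.
        return f'change of {df_name}'

evolution = Evolution_Base()
print(evolution.abc_difference)  # change of abc
print(evolution.xyz_difference)  # change of xyz
```

Late binding is not a concern here, because name is a parameter of add_df_change_property and each call gets its own binding.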

The fundamental reason why what you tried doesn't work is that a property, by design, must be stored as a class variable, not as an instance attribute.

Excerpt from the descriptor documentation:

To use the descriptor, it must be stored as a class variable in another class:

To create a class with dynamically named properties that has access to a parent class, one elegant approach is to create the class within a method of the main class, and use setattr to create class attributes with dynamic names and property objects. A class created in the closure of a method automatically has access to the self object of the parent instance, avoiding having to manage a clunky _parent attribute like you do in your attempt:

class Evolution_Base:
    def __init__(self, property_list):
        self.property_list = property_list
        self._difference = None

    @property
    def difference(self):
        if not self._difference:
            class Value_Differences:
                pass
            for name in self.property_list:
                # use default value to store the value of name in each iteration
                def func(obj, prop_name=name):
                    return self._get_df_change(prop_name) # access self via closure
                setattr(Value_Differences, name, property(func))
            self._difference = Value_Differences()
        return self._difference

    def _get_df_change(self, df_name):
        return f'df change of {df_name}' # simplified return value for demo purposes

so that:

evolution = Evolution_Base(['abc', 'xyz'])
print(evolution.difference.abc)
print(evolution.difference.xyz)

would output:

df change of abc
df change of xyz

Demo: https://replit.com/@blhsing/ExtralargeNaturalCoordinate

Responding directly to your question, you can create a class:

class FooBar:
    def __init__(self, props):
        def make_prop(name):
            return property(lambda accessor_self: self._prop_impl(name))

        self.accessor = type(
            'Accessor',
            tuple(),
            {p: make_prop(p) for p in props}
        )()

    def _prop_impl(self, arg):
        return arg


o = FooBar(['foo', 'bar'])

assert o.accessor.foo == o._prop_impl('foo')
assert o.accessor.bar == o._prop_impl('bar')

Further, it would be beneficial to cache the created class, to make equivalent objects more similar and eliminate potential issues with equality comparison.
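One way such caching might look, as a sketch rather than a definitive design: the accessor class is built by a helper memoized with functools.lru_cache and keyed on the tuple of property names. The helper name _accessor_class and the _owner attribute are assumptions introduced here; since a shared class can no longer close over one particular self, each accessor instance stores its owner instead.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def _accessor_class(props):
    # `props` must be hashable for lru_cache, hence a tuple of names.
    def make_prop(name):
        return property(lambda accessor_self: accessor_self._owner._prop_impl(name))
    return type('Accessor', (), {p: make_prop(p) for p in props})

class FooBar:
    def __init__(self, props):
        accessor = _accessor_class(tuple(props))()
        accessor._owner = self  # hypothetical back-reference to the owner
        self.accessor = accessor

    def _prop_impl(self, arg):
        return arg

a = FooBar(['foo', 'bar'])
b = FooBar(['foo', 'bar'])
c = FooBar(['baz'])

print(a.accessor.foo)                            # foo
print(type(a.accessor) is type(b.accessor))      # True: same cached class
print(type(a.accessor) is type(c.accessor))      # False: different names
```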

That said, I am not sure if this is desired. There's little benefit in replacing method call syntax ( o.f('a') ) with property access ( o.a ). I believe it can be detrimental on multiple counts: dynamic properties are confusing, harder to document, etc. Finally, while none of this is strictly guaranteed in the crazy world of dynamic Python, they communicate the wrong message: that the access is cheap and does not involve computation, and that perhaps you can attempt to write to it.

(Deleted almost working example of what you wanted - it doesn't look like it would be possible to do it that way after all - left the rest since future readers of the question may want to see there is an alternative method)

Python doesn't seem to let you build templated functions in exactly the way you would need, but you can override the functions that give you the list of supported attributes and define a generic function to retrieve them:

class Value_Differences():
    def __init__(self, parent : Evolution_Base, property_list = []):
        self._parent = parent
        self._property_list = property_list

    def __dir__(self):
        return sorted(set(
               dir(super(Value_Differences, self)) + \
               list(self.__dict__.keys()) + self._property_list))

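    def __getattr__(self, name: str):
        # Only called when normal attribute lookup fails.
        if name in self._property_list:
            return self._parent._get_df_change(name)
        # Without this, unknown attributes would silently evaluate to None.
        raise AttributeError(name)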
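A short usage sketch of this approach, with a hypothetical StubParent standing in for Evolution_Base. The variant below raises AttributeError for unknown names so that hasattr and typos behave as expected; that detail is an addition for the demo.

```python
class StubParent:  # hypothetical stand-in for Evolution_Base
    def _get_df_change(self, name):
        return f'change of {name}'

class Value_Differences:
    def __init__(self, parent, property_list=()):
        self._parent = parent
        self._property_list = list(property_list)

    def __dir__(self):
        # Advertise the dynamic names alongside the regular attributes.
        return sorted(set(super().__dir__()) | set(self._property_list))

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name in self._property_list:
            return self._parent._get_df_change(name)
        raise AttributeError(name)

diff = Value_Differences(StubParent(), ['abc', 'xyz'])
print(diff.abc)               # change of abc
print('xyz' in dir(diff))     # True
print(hasattr(diff, 'nope'))  # False
```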

In case it makes a difference, I couldn't get your Evolution_Base class methods to run on my test DataFrame instances, so I've been testing against a simplified version:

class Evolution_Base():
    
    def __init__(self, res_date_0 : DataFrame , res_date_1 : DataFrame):
        
        @dataclass
        class Results_Data():          
            res_date_0_df : DataFrame               
            res_date_1_df : DataFrame

        self.res = Results_Data(res_date_0_df= res_date_0,
                                res_date_1_df= res_date_1)

        property_list = ['abc', 'xyz']
        self.difference = Value_Differences(parent = self, property_list=property_list)
    
    # Shared Functions
    def _get_df_change(self, df_name, operator = '-'):
        df_0 = getattr(self.res.res_date_0_df, df_name.lower())
        df_1 = getattr(self.res.res_date_1_df, df_name.lower())
        if operator == '-':
            return df_1 - df_0
        elif operator == '+':
            return df_1 + df_0

I think the problem is that when you define the function func in the loop, it closes over the variable name itself, not its value at the time the function is defined, so by the time the property is accessed every getter sees the last value. To fix this, you can bind name as a default argument of the lambda, which captures the value at definition time. Note that the property also has to be set on the class rather than on the instance for the getter to be invoked:

class Value_Differences():    
    def __init__(self, parent : Evolution_Base, property_list = []):
        self._parent = parent
    
        for name in property_list:
            # name=name freezes the current value; type(self) puts the property on the class
            setattr(type(self), name, property(fget = lambda self, name=name: self._parent._get_df_change(name)))

Does this help you?

The simple question is really, how do I setattr for a property function?

In python we can set dynamic attributes like this:

class DynamicProperties():
    def __init__(self, property_list):
        self.property_list = property_list
    def add_properties(self):
        for name in self.property_list:
             setattr(self.__class__, name, property(fget=lambda self: 1))
            
dync = DynamicProperties(['a', 'b'])
dync.add_properties()
print(dync.a) # prints 1
print(dync.b) # prints 1 


Correct me if I am wrong, but from reviewing your code, you want to create dynamic attributes and set their values to a specific function call within the same class, where the data is passed in as attributes in the constructor (__init__). This is achievable; an example:

class DynamicProperties():
    def __init__(self, property_list, data1, data2):
        self.property_list = property_list
        self.data1 = data1
        self.data2 = data2
    def add_properties(self):
        for name in self.property_list:
             setattr(self.__class__, name, property(fget=lambda self: self.change(self.data1, self.data2) ))
            
    def change(self, data1, data2):
        return data1 - data2
        
        
dync = DynamicProperties(['a', 'b'], 1, 2)
dync.add_properties()
print(dync.a == dync.change(1, 2)) # prints True
print(dync.b == dync.change(1, 2)) # prints True


You just have to add more complexity to the member: __getattr__ / __setattr__ give you the attribute name as a string, so it can be interpreted as needed. The biggest "problem" with doing this is that the return value might not be consistent, and piping it back to a library that expects an object to have a specific behavior can cause soft errors.

This example is not the same as yours, but it uses the same concept: manipulating columns through members. To get a copy with changes, a setter is not needed; by copying, modifying and returning, a new instance can be created with whatever is needed.

For example, the __getattr__ in this line:

var = data_a.xyz_mull_0()

will:

  1. Check and interpret the string xyz_mull_0
  2. Validate that the members and the operand exist
  3. Make a copy of data_a
  4. Modify the copy and return it

This looks more complex than it actually is. With members of the same instance it's clear what it is doing, but the _of modifier needs a callback: __getattr__ only receives one parameter (the attribute name), so it has to save the attribute and return a callback, which is then called with the other instance, calls back into __getattr__, and completes the rest of the operation.

import re

class FlexibleFrame:

    operand_mod = {
        'sub': lambda a, b: a - b,
        'add': lambda a, b: a + b,
        'div': lambda a, b: a / b,
        'mod': lambda a, b: a % b,
        'mull': lambda a, b: a * b,
    }

    @staticmethod
    def add_operand(name, func):
        if name not in FlexibleFrame.operand_mod.keys():
            FlexibleFrame.operand_mod[name] = func

    # This makes this class subscriptable 
    def __getitem__(self, item):
        return self.__dict__[item]

    # Uses:
    #   -> object.value
    #   -> object.member()
    #   -> object.<name>_<operand>_<name|int>()
    #   -> object.<name>_<operand>_<name|int>_<flow>()

    def __getattr__(self, attr):
        if re.match(r'^[a-zA-Z]+_[a-zA-Z]+_[a-zA-Z0-9]+(_of)?$', attr):
            seg = attr.split('_')
            var_a, operand, var_b = seg[0:3]

            # If there is a _of: the second operand is from the other 
            # instance, the _of is removed and a callback is returned 
            if len(seg) == 4:
                self.__attr_ref = '_'.join(seg[0:3])
                return self.__getattr_of

            # Checks if this was a _of attribute and resets it
            if self.__back_ref is not None:
                other = self.__back_ref
                self.__back_ref = None
                self.__attr_ref = None
            else:
                other = self

            if var_a not in self.__dict__:
                raise AttributeError(
                    f'No match of {var_a} in (primary) {__class__.__name__}'
                )
            if operand not in FlexibleFrame.operand_mod.keys():
                raise AttributeError(
                    f'No match of operand {operand}'
                )

            # The return is a copy of self (lists are copied too); otherwise
            # the original instance would be modified, making x = a.b() useless
            ret = FlexibleFrame(**{
                k: list(v) if isinstance(v, list) else v
                for k, v in self.__dict__.items()
            })

            # Checks if the second operand is a int
            if re.match(r'^\d+$', var_b) :
                ref_b_num = int(var_b)
                for i in range(len(self[var_a])):
                    ret[var_a][i] = FlexibleFrame.operand_mod[operand](
                        self[var_a][i], ref_b_num
                    )
            elif var_b in other.__dict__:
                for i in range(len(self[var_a])):
                    # out_index = operand[type](in_a_index, in_b_index)
                    ret[var_a][i] = FlexibleFrame.operand_mod[operand](
                        self[var_a][i], other[var_b][i]
                    )
            else:
                raise AttributeError(
                    f'No match of {var_b} in (secondary) {__class__.__name__}'
                )

            # This swaps the .member to a .member()
            # it also adds and extra () in __getattr_of
            return lambda: ret
            # return ret

        if attr in self.__dict__:
            return self[attr]

        raise AttributeError(
            f'No match of {attr} in {__class__.__name__}'
        )

    def __getattr_of(self, other):
        self.__back_ref = other
        return self.__getattr__(self.__attr_ref)()

    def __init__(self, **kwargs):
        self.__back_ref = None
        self.__attr_ref = None

        #TODO: Check if data columns match in size
        # if not, implement column_<name>_filler=<default>
        for i in kwargs:
            self.__dict__[i] = kwargs[i]


if __name__ == '__main__':
    data_a = FlexibleFrame(**{
        'abc': [i for i in range(10)],
        'nmv': [i for i in range(10)],
        'xyz': [i for i in range(10)],  
    })
    data_b = FlexibleFrame(**{
        'fee': [i + 10 for i in range(10)],
        'foo': [i + 10 for i in range(10)],     
    })

    FlexibleFrame.add_operand('set', lambda a, b: b)

    var = data_a.xyz_mull_0()
    var = var.abc_set_xyz()
    var = var.xyz_add_fee_of(data_b)

As an extra note: lambdas in Python bind names late (variables are looked up when the lambda is called, not when it is defined), which can make them difficult to use when self changes.

It seems you're bending the language to do weird things. I'd take it as a smell that your code is probably getting convoluted but I'm not saying there would never be a use-case for it so here is a minimal example of how to do it:

class Obj:
    def _df_change(self, arg):
        print('change', arg)


class DynAttributes(Obj):
    def __getattr__(self, name):
        return self._df_change(name)


class Something:
    difference = DynAttributes()


a = Something()

b = Obj()

assert a.difference.hello == b._df_change('hello')

When calling setattr, use self.__class__ instead of self.

Code sample:

from typing import List

class A:
    def __init__(self, names: List[str]):
        for name in names:
            # Properties must live on the class, not on the instance.
            setattr(self.__class__, name, property(fget=self.__create_getter(name)))

    def __create_getter(self, name: str):
        def inner(self):
            print(f"invoking {name}")
            return 10
        return inner

a = A(['x','y'])

print(a.x + 1)
print(a.y + 2)
