Python assign different variables to a class object

Question

This is a general Python question. Is it possible to assign different variables to a class object and then perform a different set of operations on each of those variables? I'm trying to reduce code, but maybe this isn't how it works. For example, I'm trying to do something like this:

Edit: here is an abstract of the class and methods:

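import pandas as pd

class Class:
    def __init__(self, df):
        self.df = df

    def query(self, query):
        # Replaces the stored frame with the filtered rows
        self.df = self.df.query(query)
        return self

    def forward_fill(self, like):
        # Treat zeros in the matching columns as missing and fill them row-wise
        self.df.update(self.df.filter(like=like).mask(lambda x: x == 0).ffill(axis=1))
        return self

    def diff(self, cols=None, axis=1):
        # Difference every column except those in cols, then re-attach cols
        diff = self.df[self.df.columns[~self.df.columns.isin(cols)]].diff(axis=axis)
        self.df = diff.join(self.df[self.df.columns.difference(diff.columns)])
        return self

    def melt(self, cols, var=None, value=None):
        # Returns a plain DataFrame, not a Class instance
        return pd.melt(self.df, id_vars=cols, var_name=var, value_name=value)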

I'm trying to use it like this:

df = pd.read_csv('data.csv')

df = Class(df)
df = df.query(query).forward_fill(include)

df_1 = df.diff(cols).melt(cols)

df_2 = df.melt(cols)

df_1 and df_2 should have different values; however, df_2 comes out the same as df_1. This issue is resolved if I use the class like this:

df_1 = pd.read_csv('data.csv')
df_2 = pd.read_csv('data.csv')

df_1 = Class(df_1)
df_2 = Class(df_2)

df_1 = df_1.query(query).forward_fill(include)
df_2 = df_2.query(query).forward_fill(include)

df_1 = df_1.diff(cols).melt(cols)

df_2 = df_2.melt(cols)

This results in extra code. Is there a better way to do this where you can use an object differently on different variables, or do I have to create separate objects if I'm trying to have two variables perform separate operations and return different values?

Answer 1

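With the return self statement in the diff method you return a reference to the same object, not a copy. Before it returns, diff has already reassigned self.df, so the original df is changed as well. The melt method does not modify the object; it only returns a new DataFrame built from whatever self.df holds at that moment.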

Here:

df = pd.read_csv('data.csv')

df = Class(df)
df = df.query(query).forward_fill(include)

df_1 = df.diff(cols).melt(cols)

After these lines, df holds the same, already diffed, data that df_1 was built from. I guess the melt method, called with no arguments other than cols, only reshapes that data and assigns the column names. Consequently, df_2 = df.melt(cols) melts the diffed frame again and gives the same result as df_1.
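The effect does not depend on pandas. Here is a minimal sketch with a toy class (Counter, add and total are hypothetical names, not part of the question's code) whose method rebinds an attribute and returns self, the same pattern as diff:

```python
class Counter:
    """Hypothetical toy class: add() changes the instance and returns self."""

    def __init__(self, values):
        self.values = values

    def add(self, n):
        # Rebinding the attribute changes the state of this instance
        self.values = [v + n for v in self.values]
        # Returning self hands back the same object, not a copy
        return self

    def total(self):
        return sum(self.values)


obj = Counter([1, 2, 3])
changed = obj.add(10)

print(changed is obj)  # True: both names point at one object
print(obj.total())     # 36: the "original" was modified as well
```

Both names refer to a single object, so any later call through either name sees the modified state.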

If you want to work with one object, you should not use self.df = ... in your class methods, because this changes the instance's df. Instead, assign the result to a local variable, df = ..., and then return Class(df).

For example:

def diff(self, cols=None, axis=1):
    diff = self.df[self.df.columns[~self.df.columns.isin(cols)]].diff(axis=axis)
    df = diff.join(self.df[self.df.columns.difference(diff.columns)])
    return Class(df)
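Applied to the whole class, the idea might look like the sketch below. It is only an illustration under some assumptions: the class name Frame, the sample data, and the query string are made up, forward_fill is left out for brevity, and value defaults to "value" here so the melted column gets a regular name.

```python
import pandas as pd


class Frame:
    """Hypothetical wrapper whose chainable methods never modify self.df."""

    def __init__(self, df):
        self.df = df

    def query(self, expr):
        # Wrap the filtered rows in a new object instead of rebinding self.df
        return Frame(self.df.query(expr))

    def diff(self, cols, axis=1):
        diff = self.df[self.df.columns[~self.df.columns.isin(cols)]].diff(axis=axis)
        return Frame(diff.join(self.df[self.df.columns.difference(diff.columns)]))

    def melt(self, cols, var=None, value="value"):
        return pd.melt(self.df, id_vars=cols, var_name=var, value_name=value)


# Made-up sample data standing in for data.csv
data = pd.DataFrame({"id": ["a", "b", "c"], "x": [1.0, 2.0, 0.5], "y": [4.0, 8.0, 1.0]})
base = Frame(data).query("x >= 1")

df_1 = base.diff(["id"]).melt(["id"])  # melted differences
df_2 = base.melt(["id"])               # melted original values

print(df_1.equals(df_2))  # False: base was not touched by diff
```

With this design the file is read once and the shared steps are chained once; every further step branches off a fresh object, so df_1 and df_2 no longer interfere.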

Best regards

solution1
1 ACCEPTED 2020-06-29 19:49:21