
How can I write cleaner code and speed up my program in this case?

I have a class, "DataProcessing", that processes a dataframe "df".

Processing df involves these steps:

First, read the dataframe "df" from a CSV file.

Second, concatenate df with a constant dataframe.

Third, calculate some indicators from df.

Fourth, evaluate a strategy against the df data.

Fifth, run the fourth step with multiprocessing.

Here is the code I wrote using a single function, "data_processing":

import pandas as pd

class DataProcessing:
    def __init__(self, constant_df):
        self.constant_df = constant_df

    # First, get dataframe "df" from csv file.
    def data_processing(self, code):
        self.df = pd.read_csv(...)

        # Second, concat df and constant df.
        self.df = pd.concat([self.df, self.constant_df])  # Should I use a new variable name here?

        # Third, calculate some indicators with df.
        self.df = ...  # do some calculations

        # Fourth, judge strategy with df data.
        if ...:  # some condition on self.df
            ...

    # Fifth, multiprocessing fourth step.
    def multiprocessing(self, code):
        ...  # use multiprocessing Pool

Or I can split it into multiple functions like this:

class DataProcessing:
    def __init__(self, constant_df):
        self.constant_df = constant_df

    # First, get dataframe "df" from csv file.
    def get_df(self, code):
        self.df = pd.read_csv(...)

    # Second, concat df and constant df.
    def concat_df(self, code):
        self.get_df(code)  # Must run get_df first to set self.df
        self.df = pd.concat([self.df, self.constant_df])

    # Third, calculate some indicators with df.
    def calculate_indicators(self, code):
        self.concat_df(code)  # Must run concat_df first so self.df is concatenated

    # Fourth, judge strategy with df data.
    def judge_strategy(self, code):
        self.calculate_indicators(code)  # Must run calculate_indicators first
        if ...:  # some condition on self.df
            ...

    # Fifth, multiprocessing fourth step.
    def multiprocessing(self):
        ...  # use multiprocessing Pool

But if I use multiple functions, each function has to call the previous one first and pass along the same argument "code". That makes me wonder: is there a cleaner way to write this?

Thanks

Why not store the code as an instance attribute?

    def __init__(self, constant_df, code):
        self.constant_df = constant_df 
        self.code = code
...
    # Fifth, multiprocessing fourth step.
    def multiprocessing(self):
        self.code # use it there      
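
A minimal sketch of this first option, assuming one instance per code. The `csv_source` parameter, the `run` method and the sample data are hypothetical names added only so the example runs without a real file; they are not part of the original class.

```python
import io

import pandas as pd


class DataProcessing:
    def __init__(self, constant_df, code, csv_source):
        self.constant_df = constant_df
        self.code = code
        # Hypothetical: a path or file-like object holding the CSV for `code`.
        self.csv_source = csv_source
        self.df = None

    def get_df(self):
        # No `code` argument needed: everything lives on the instance.
        self.df = pd.read_csv(self.csv_source)

    def concat_df(self):
        self.df = pd.concat([self.df, self.constant_df], ignore_index=True)

    def run(self):
        # Steps run in order; none of them calls the previous one itself.
        self.get_df()
        self.concat_df()
        return self.df


# Sample data standing in for the real CSV file and constant dataframe.
constant_df = pd.DataFrame({"close": [10.0]})
processor = DataProcessing(constant_df, "AAA", io.StringIO("close\n1.0\n2.0\n"))
result = processor.run()
```

With this layout each code gets its own instance, so no method has to receive `code` again.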

Or you can have one overall function that runs every step in order:

    def __init__(self, constant_df):
        self.constant_df = constant_df

    # Overall: runs each step once, in order
    def process_all(self, code):
        self.get_df(code)
        self.concat_df()
        self.calculate_indicators()
        self.judge_strategy()
        #...

    # First, get dataframe "df" from csv file.
    def get_df(self, code):
        self.df = pd.read_csv(...)

    # Second, concat df and constant df.
    def concat_df(self):
        self.df = pd.concat([self.df, self.constant_df])

    # Third, calculate some indicators with df.
    def calculate_indicators(self):
        self.df = ...  # do some calculations

    # Fourth, judge strategy with df data.
    def judge_strategy(self):
        if ...:  # some condition on self.df
            ...

    # Fifth, multiprocessing fourth step.
    def multiprocessing(self):
        ...  # use multiprocessing Pool
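
For the multiprocessing step, one possible layout (a sketch under assumptions, not the only way) is to make each step a plain function that takes a dataframe and returns a new one, then map a top-level worker over the codes with `multiprocessing.Pool`. The `SAMPLE_CLOSES` data, the "change" indicator and the judging rule below are hypothetical placeholders for the real CSV files and strategy.

```python
from functools import partial
from multiprocessing import Pool

import pandas as pd

# Hypothetical in-memory stand-in for the per-code CSV files.
SAMPLE_CLOSES = {"AAA": [1.0, 2.0, 3.0], "BBB": [3.0, 2.0, 1.0]}


def get_df(code):
    # Real code would call pd.read_csv(...) for this code instead.
    return pd.DataFrame({"close": SAMPLE_CLOSES[code]})


def concat_df(df, constant_df):
    # Returns a new dataframe, so no shared state is overwritten.
    return pd.concat([df, constant_df], ignore_index=True)


def calculate_indicators(df):
    # Hypothetical indicator: change from the previous row.
    return df.assign(change=df["close"].diff())


def judge_strategy(code, constant_df):
    df = calculate_indicators(concat_df(get_df(code), constant_df))
    # Hypothetical rule: the last change is positive.
    return code, bool(df["change"].iloc[-1] > 0)


def judge_all(codes, constant_df, processes=2):
    # The worker must be a top-level function so it can be pickled.
    worker = partial(judge_strategy, constant_df=constant_df)
    with Pool(processes=processes) as pool:
        return dict(pool.map(worker, codes))


if __name__ == "__main__":
    constant = pd.DataFrame({"close": [2.5]})
    print(judge_all(["AAA", "BBB"], constant))
```

Because each step returns its result instead of mutating `self.df`, the steps can be tested on their own, and each worker process handles one code independently. Whether this actually speeds things up depends on how expensive the per-code work is compared with the cost of starting processes and passing data between them.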
