
Python function does not call given argument for every iteration in inner loop

I have written this code:

class_1500_strings = ['transistor', 'resistor', 'diode', 'processor', 'thermistor', '555-timer', 'microcontroller']

class_1500 = {'conductivity' : gaussian_sample(100, 10, 250),
              'price_per_unit' : gaussian_sample(10, 2, 250),
              'number_bought' : categorical_sample(0, 10, 250),
              'manufacturer' : string_sample(250, class_1500_strings),
              'acquisition_date' : date_random_sample("1/1/2008 1:30 PM", "1/1/2009 4:50 AM", col_length=250),
              'runtime' : gaussian_sample(1000, 200, 250)}

def generate_table(class_dict, class_label, number_of_samples):
    X, y = [], []
    for table_idx in range(number_of_samples):
        df = pd.DataFrame(class_dict)
        label = class_label
        X.append(df)
        y.append(label)
    return X, y

X, y = generate_table(class_1500, 0, 5)

The purpose is to build artificial sample dataframes. The problem is that X is a list of identical dataframes: the random generators in the class dictionary are not called again on each iteration. How can I make the function produce a list of different datasets (i.e. call the samplers on every pass through the loop)?

You need to create a new dictionary for each dataframe you construct. With your current logic, the sampler calls run exactly once, at the moment class_1500 is defined. From then on the dictionary holds only the resulting array-like values and has lost all connection with the random generator logic.
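The point above can be checked with a small sketch. It uses only the standard library; gaussian_sample here is a hypothetical stand-in for the sampler in the question (whose real implementation is not shown), and the calls counter is added purely for illustration.

```python
import random

calls = 0  # illustration only: counts how often the sampler actually runs

def gaussian_sample(mean, std, n):
    # Hypothetical stand-in for the sampler used in the question.
    global calls
    calls += 1
    return [random.gauss(mean, std) for _ in range(n)]

# The call is evaluated right here, once; the dict stores the resulting list.
class_dict = {'runtime': gaussian_sample(1000, 200, 250)}

# Reading the value back does not re-run the sampler.
first = class_dict['runtime']
second = class_dict['runtime']

print(calls)            # 1
print(first is second)  # True
```

However many times the loop reads class_dict, it gets the same stored values, which is why every dataframe built from it is identical.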

One way is to define a separate function which gives different arrays each time it is run:

def make_data():
    return {'conductivity' : gaussian_sample(100, 10, 250),
            ...
            'runtime' : gaussian_sample(1000, 200, 250)}

def generate_table(class_label, number_of_samples):
    X, y = [], []
    for table_idx in range(number_of_samples):
        df = pd.DataFrame(make_data())
        label = class_label
        X.append(df)
        y.append(label)
    return X, y

X, y = generate_table(0, 5)
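If you would rather keep passing a dictionary into generate_table, a variant of the same idea is to store the samplers as zero-argument callables and call them inside the loop. This is only a sketch under some assumptions: gaussian_sample is a hypothetical stand-in for the sampler in the question, class_1500_spec and draw_columns are names introduced here for illustration, and only two columns are shown. It uses plain lists so that it runs without pandas; in real code each dict of columns would be wrapped in pd.DataFrame.

```python
import random

def gaussian_sample(mean, std, n):
    # Hypothetical stand-in for the sampler used in the question.
    return [random.gauss(mean, std) for _ in range(n)]

# Store *how* to sample each column (a callable), not the sampled values.
class_1500_spec = {
    'conductivity': lambda: gaussian_sample(100, 10, 250),
    'runtime': lambda: gaussian_sample(1000, 200, 250),
}

def draw_columns(spec):
    # Call every sampler now, producing a fresh dict of columns.
    return {name: sampler() for name, sampler in spec.items()}

def generate_table(spec, class_label, number_of_samples):
    X, y = [], []
    for _ in range(number_of_samples):
        # With pandas available, this would be pd.DataFrame(draw_columns(spec)).
        X.append(draw_columns(spec))
        y.append(class_label)
    return X, y

X, y = generate_table(class_1500_spec, 0, 5)
print(len(X), y)
```

Because the lambdas defer the call, each pass through the loop draws new values, so the five tables differ from one another.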

You are constructing a DataFrame from the same value (class_dict) in each iteration of your loop. If you want the DataFrame to be different for each iteration, you'll have to provide a different value. Try changing your for loop to for key in class_dict, and pass key as the argument to DataFrame.

That would give you one DataFrame for each key of your dictionary, where the contents of each DataFrame come from the value stored under that key (the sample functions).

