Python function does not call given argument for every iteration in inner loop

I have written this code:

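import pandas as pd

# gaussian_sample, categorical_sample, string_sample and date_random_sample
# are my own helper functions (definitions not shown here).
class_1500_strings = ['transistor', 'resistor', 'diode', 'processor', 'thermistor', '555-timer', 'microcontroller']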

class_1500 = {'conductivity' : gaussian_sample(100, 10, 250),
              'price_per_unit' : gaussian_sample(10, 2, 250),
              'number_bought' : categorical_sample(0, 10, 250),
              'manufacturer' : string_sample(250, class_1500_strings),
              'acquisition_date' : date_random_sample("1/1/2008 1:30 PM", "1/1/2009 4:50 AM", col_length=250),
              'runtime' : gaussian_sample(1000, 200, 250)}

def generate_table(class_dict, class_label, number_of_samples):
    X, y = [], []
    for table_idx in range(number_of_samples):
        df = pd.DataFrame(class_dict)
        label = class_label
        X.append(df)
        y.append(label)
    return X, y

X, y = generate_table(class_1500, 0, 5)

The purpose is to build artificial sample dataframes. The problem is that X is a list of identical dataframes: the random generators in the class dictionary are not called again for each one. How can I make the function produce a list of different datasets (i.e. call the samplers on every pass through the loop)?

You need to create a new dictionary for each dataframe you construct. With your current logic, as soon as class_1500 is defined it has lost all connection with the random generator logic: the sampler calls have already run, and the dictionary values are just the array-like results.
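To make that concrete, here is a minimal sketch of the eager evaluation. The gaussian_sample below is a hypothetical stand-in written with the standard library, since the question does not show the real helper; only the timing of the call matters here.

```python
import random

def gaussian_sample(mean, std, n):
    # Hypothetical stand-in for the question's sampler: n draws from N(mean, std).
    return [random.gauss(mean, std) for _ in range(n)]

random.seed(0)  # fixed seed so the sketch is reproducible

# The sampler runs once, right here; the dictionary stores only the resulting list.
class_dict = {'conductivity': gaussian_sample(100, 10, 5)}

first = list(class_dict['conductivity'])
second = list(class_dict['conductivity'])
same_values = first == second  # reading the dict never re-runs the sampler

# A fresh call, by contrast, draws new numbers.
fresh = gaussian_sample(100, 10, 5)
fresh_differs = fresh != first
```

Every pd.DataFrame(class_dict) built from such a dictionary therefore sees the same stored lists.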

One way is to define a separate function that returns different arrays each time it is run:

def make_data():
     return {'conductivity' : gaussian_sample(100, 10, 250),
             ...
             'runtime' : gaussian_sample(1000, 200, 250)}

def generate_table(class_label, number_of_samples):
    X, y = [], []
    for table_idx in range(number_of_samples):
        df = pd.DataFrame(make_data())
        label = class_label
        X.append(df)
        y.append(label)
    return X, y

X, y = generate_table(0, 5)
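If you would rather keep passing a dictionary into generate_table, one possible variation (not part of the answer above) is to store zero-argument callables as the values and call them inside the loop. This is only a sketch under assumptions: gaussian_sample is again a hypothetical standard-library stand-in, class_1500_spec is an illustrative name, and the tables are kept as plain dicts so the example runs without pandas.

```python
import random

def gaussian_sample(mean, std, n):
    # Hypothetical stand-in for the question's sampler.
    return [random.gauss(mean, std) for _ in range(n)]

# Each value is a function that draws a fresh column when called.
class_1500_spec = {
    'conductivity': lambda: gaussian_sample(100, 10, 250),
    'runtime': lambda: gaussian_sample(1000, 200, 250),
}

def generate_table(class_spec, class_label, number_of_samples):
    X, y = [], []
    for _ in range(number_of_samples):
        # Call every sampler now, so each table gets its own draws.
        columns = {name: sampler() for name, sampler in class_spec.items()}
        X.append(columns)  # with pandas installed: X.append(pd.DataFrame(columns))
        y.append(class_label)
    return X, y

random.seed(0)  # fixed seed so the sketch is reproducible
X, y = generate_table(class_1500_spec, 0, 5)
```

The lambdas defer the sampling until generate_table calls them, which is the same effect make_data() achieves by rebuilding the whole dictionary.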

You are constructing a DataFrame from the same value (class_dict) in each iteration of your loop. If you want the DataFrame to be different on each iteration, you have to provide a different value each time. Try changing your for loop to "for key in class_dict" and passing key as the argument to DataFrame.

That would give you one DataFrame for each key of your dictionary, where the values of each DataFrame are generated by the value stored under that key (the sample functions).
