
Custom Multiple Input Primitive Bug returns “TypeError: issubclass() arg 1 must be a class”

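I am using the Featuretools library to try to generate a custom feature involving customer transactions. I tested the function on its own and it returned an answer, so I'm not sure why I am getting this error.

I tried following this page: https://featuretools.alteryx.com/en/stable/getting_started/primitives.html

Thanks!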

import featuretools as ft
import numpy as np
import pandas as pd
from featuretools.primitives import make_agg_primitive
from featuretools.variable_types import DatetimeTimeIndex, Numeric, Categorical

def test_fun(categorical, datetimeindex):
    # Pair each store name with its session start time
    x = pd.DataFrame({'store_name': categorical, 'session_start_time': datetimeindex})
    # Keep only the sessions at the most frequent store
    x_mode = list(x['store_name'].mode())[0]
    x = x[x['store_name'] == x_mode]
    # Gaps between consecutive sessions, in seconds (the first gap counts as 0)
    y = x.session_start_time.diff().fillna(pd.Timedelta(seconds=0))/np.timedelta64(1, 's')
    return y.median()


Test_Fun = make_agg_primitive(function = test_fun,
                                input_types = [Categorical, DatetimeTimeIndex],
                                return_type = [Numeric])
      
fm, fd = ft.dfs(
    entityset = es,
    target_entity = 'customers',
    
    agg_primitives = [Test_Fun],

    cutoff_time = lt,
    cutoff_time_in_index = True,
    include_cutoff_time = False,
    verbose = True,
)

This results in the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-492-358f980bb6b0> in <module>
     20                                 return_type = [Numeric])
     21 
---> 22 fm, fd = ft.dfs(
     23     entityset = es,
     24     target_entity = 'customers',

~\Anaconda3\lib\site-packages\featuretools\utils\entry_point.py in function_wrapper(*args, **kwargs)
     38                     ep.on_error(error=e,
     39                                 runtime=runtime)
---> 40                 raise e
     41 
     42             # send return value

~\Anaconda3\lib\site-packages\featuretools\utils\entry_point.py in function_wrapper(*args, **kwargs)
     30                 # call function
     31                 start = time.time()
---> 32                 return_value = func(*args, **kwargs)
     33                 runtime = time.time() - start
     34             except Exception as e:

~\Anaconda3\lib\site-packages\featuretools\synthesis\dfs.py in dfs(entities, relationships, entityset, target_entity, cutoff_time, instance_ids, agg_primitives, trans_primitives, groupby_trans_primitives, allowed_paths, max_depth, ignore_entities, ignore_variables, primitive_options, seed_features, drop_contains, drop_exact, where_primitives, max_features, cutoff_time_in_index, save_progress, features_only, training_window, approximate, chunk_size, n_jobs, dask_kwargs, verbose, return_variable_types, progress_callback, include_cutoff_time)
    259                                       seed_features=seed_features)
    260 
--> 261     features = dfs_object.build_features(
    262         verbose=verbose, return_variable_types=return_variable_types)
    263 

~\Anaconda3\lib\site-packages\featuretools\synthesis\deep_feature_synthesis.py in build_features(self, return_variable_types, verbose)
    287             assert isinstance(return_variable_types, list), msg
    288 
--> 289         self._run_dfs(self.es[self.target_entity_id], RelationshipPath([]),
    290                       all_features, max_depth=self.max_depth)
    291 

~\Anaconda3\lib\site-packages\featuretools\synthesis\deep_feature_synthesis.py in _run_dfs(self, entity, relationship_path, all_features, max_depth)
    412         """
    413 
--> 414         self._build_transform_features(all_features, entity, max_depth=max_depth)
    415 
    416         """

~\Anaconda3\lib\site-packages\featuretools\synthesis\deep_feature_synthesis.py in _build_transform_features(self, all_features, entity, max_depth, require_direct_input)
    576                 input_types = input_types[0]
    577 
--> 578             matching_inputs = self._get_matching_inputs(all_features,
    579                                                         entity,
    580                                                         new_max_depth,

~\Anaconda3\lib\site-packages\featuretools\synthesis\deep_feature_synthesis.py in _get_matching_inputs(self, all_features, entity, max_depth, input_types, primitive, primitive_options, require_direct_input, feature_filter)
    793                              primitive, primitive_options, require_direct_input=False,
    794                              feature_filter=None):
--> 795         features = self._features_by_type(all_features=all_features,
    796                                           entity=entity,
    797                                           max_depth=max_depth,

~\Anaconda3\lib\site-packages\featuretools\synthesis\deep_feature_synthesis.py in _features_by_type(self, all_features, entity, max_depth, variable_type)
    768             if (variable_type == variable_types.PandasTypes._all or
    769                     f.variable_type == variable_type or
--> 770                     any(issubclass(f.variable_type, vt) for vt in variable_type)):
    771                 if max_depth is None or f.get_depth(stop_at=self.seed_features) <= max_depth:
    772                     selected_features.append(f)

~\Anaconda3\lib\site-packages\featuretools\synthesis\deep_feature_synthesis.py in <genexpr>(.0)
    768             if (variable_type == variable_types.PandasTypes._all or
    769                     f.variable_type == variable_type or
--> 770                     any(issubclass(f.variable_type, vt) for vt in variable_type)):
    771                 if max_depth is None or f.get_depth(stop_at=self.seed_features) <= max_depth:
    772                     selected_features.append(f)

TypeError: issubclass() arg 1 must be a class


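I think I figured it out. If there is a better way, please let me know!

I'm not sure why the approach in the documentation didn't work (it uses a function rather than a class and makes no mention of classes).

I was able to solve the problem by adapting the solution to this question:

How do I get the group mean of items but exclude the item itself?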


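import numpy as np
import pandas as pd
from featuretools.primitives import AggregationPrimitive
from featuretools.variable_types import DatetimeTimeIndex, Numeric, Categorical

class Test_Fun(AggregationPrimitive):

    name = "test_fun"
    input_types = [Categorical, DatetimeTimeIndex]
    return_type = Numeric  # a single variable type, not a list
    stack_on_self = False

    def get_function(self):

        # The inner function keeps its name from the solution it was adapted from
        def mean_excluding_value(categorical, datetimeindex):
            x = pd.DataFrame({'store_name': categorical, 'session_start_time': datetimeindex})
            # Keep only the sessions at the most frequent store
            x_mode = list(x['store_name'].mode())[0]
            x = x[x['store_name'] == x_mode]
            # Gaps between consecutive sessions, in seconds (the first gap counts as 0)
            y = x.session_start_time.diff().fillna(pd.Timedelta(seconds=0))/np.timedelta64(1, 's')
            return y.median()

        return mean_excluding_value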

     
fm, fd = ft.dfs(
    entityset = es,
    target_entity = 'customers',
    
    agg_primitives = [Test_Fun],

    cutoff_time = lt,
    cutoff_time_in_index = True,
    include_cutoff_time = False,
    verbose = True,
)
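
To check the aggregation logic without Featuretools, here is a minimal sketch that runs the same computation on a tiny hand-made sample. The function name median_gap_in_top_store and the sample values are made up for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd


def median_gap_in_top_store(categorical, datetimeindex):
    """Median gap in seconds between sessions at the most frequent store."""
    x = pd.DataFrame({'store_name': categorical, 'session_start_time': datetimeindex})
    # Most frequent store name (the first one if there is a tie)
    x_mode = list(x['store_name'].mode())[0]
    x = x[x['store_name'] == x_mode]
    # Consecutive differences; the first row has no predecessor, so it becomes 0
    y = x.session_start_time.diff().fillna(pd.Timedelta(seconds=0)) / np.timedelta64(1, 's')
    return y.median()


# Hypothetical sample: store 'A' appears three times, at 10:00:00, 10:00:10 and 10:00:40
stores = pd.Series(['A', 'B', 'A', 'A'])
times = pd.Series(pd.to_datetime([
    '2021-01-01 10:00:00',
    '2021-01-01 10:00:05',
    '2021-01-01 10:00:10',
    '2021-01-01 10:00:40',
]))

result = median_gap_in_top_store(stores, times)
print(result)  # gaps for store 'A' are 0, 10 and 30 seconds, so the median is 10.0
```

Because the first gap is filled with 0, a group with few sessions will have its median pulled toward zero. Whether that is desirable depends on the feature you want.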

In this part of the code:

Test_Fun = make_agg_primitive(function = test_fun,
                                input_types = [Categorical, DatetimeTimeIndex],
                                return_type = [Numeric])

return_type should be set to Numeric rather than [Numeric]: it expects a single variable type class, not a list.

This code worked for me:

Test_Fun = make_agg_primitive(function = test_fun,
                                input_types = [Categorical, DatetimeTimeIndex],
                                return_type = Numeric)
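
As a rough illustration of why the list triggers this particular message, the sketch below reproduces the shape of the check from the traceback in plain Python. Variable and Numeric here are stand-in classes defined only for this example, not the real Featuretools types, and matches is a hypothetical helper. The assumption is that the feature's variable_type ends up being the list [Numeric], which is then passed as the first argument to issubclass().

```python
class Variable:
    """Stand-in base class (hypothetical, not the Featuretools class)."""


class Numeric(Variable):
    """Stand-in for a numeric variable type (hypothetical)."""


def matches(feature_type, wanted_types):
    # Same shape as the check in the traceback:
    # any(issubclass(f.variable_type, vt) for vt in variable_type)
    return any(issubclass(feature_type, vt) for vt in wanted_types)


# A class as the first argument works as expected
ok = matches(Numeric, [Variable])

# A list as the first argument raises the TypeError from the question
try:
    matches([Numeric], [Variable])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

The first call returns True, while the second fails because issubclass() requires its first argument to be a class. That is consistent with the fix of passing Numeric directly.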

