
Python apriori returning Generator instead of Dataframe

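I am writing code that takes a small portion of a dataset (shopping baskets) and converts it into a one-hot encoded dataframe. I then want to run mlxtend's apriori algorithm on it to obtain the frequent itemsets.

However, whenever I run the apriori algorithm, it seems to finish instantly and returns a generator object instead of a dataframe. I followed the instructions in the documentation, and in their example apriori returns a dataframe. What am I doing wrong?

Here is my code: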

import numpy as np
import pandas as pd
import csv
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules
from mlxtend.preprocessing import TransactionEncoder
from apyori import apriori

def simpleRandomisedSample(filename, support_frac, sample_frac):
    df1 = pd.read_csv("%s.csv" % filename, header=None) #Saving csv file into a dataframe in memory
    size = len(df1)
    support = support_frac * len(df1) #Sets the original support value to x% of the original dataset
    sample_support = support * sample_frac #Support for our reduced sample as a fraction of the original support
    sample = df1.sample(frac=sample_frac) #Saving x% (randomised) of the dataset as our sample
    sample = sample.reset_index(drop = True) #Resetting indexes (which previously got randomised along with the data)
    del df1 #Deleting original dataframe from memory to clear up space
    sample_size = len(sample)
    return size, support, sample_size, sample_support, sample

def main():
    size, support, sample_size, sample_support, sample = simpleRandomisedSample("chess",0.01,0.1)
    print("The original dataset had %d rows and a support of %.2f" % (size, support))
    print("The dataset was reduced to %d rows and the sample has a support of %.2f" % (sample_size, sample_support)) 

    sample_list = sample.values.tolist() #Converting Dataframe to list of lists for use with Apriori
    te = TransactionEncoder()
    te_ary = te.fit(sample_list).transform(sample_list) #Preprocessing our sample to work with Apriori algorithm
    df = pd.DataFrame(te_ary, columns=te.columns_)
    print(df)
    frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)
    print(frequent_itemsets)
    
if __name__ == "__main__":
    main()

You have a name conflict in your imports:

from mlxtend.frequent_patterns import apriori
[...]
from apyori import apriori

Your code is not using the mlxtend algorithm but the one provided by apyori, because the later import overrides the earlier one.
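The same rebinding can be reproduced with the standard library alone. This is a minimal sketch that uses `os.path.join` and `shlex.join` as stand-ins for the two `apriori` functions; the variable `winner` is only there for illustration.

```python
from os.path import join  # first import binds the name `join`
from shlex import join    # second import rebinds the same name

# Only the last binding survives: `join` now refers to shlex.join,
# and os.path.join is no longer reachable through this name.
winner = join.__module__
print(winner)

# The call behaves like shlex.join (shell-quoted, space-separated),
# not like os.path.join.
print(join(["a", "b c"]))
```

Checking `apriori.__module__` in the same way is a quick way to confirm which library a name currently points to.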

You can remove the one you do not use or, if you want access to both later, give each one a different name:

from mlxtend.frequent_patterns import apriori as mlx_apriori
from apyori import apriori as apy_apriori
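The symptom in the question (an instant return and a generator object) is what one would expect from a function that yields its results lazily. The sketch below illustrates this with a hypothetical stand-in, `lazy_apriori`, which is not the API of either library and only counts single items. The transactions are made-up sample data.

```python
import types


def lazy_apriori(transactions, min_support=0.5):
    """Hypothetical stand-in: lazily yields (item, support) for frequent single items."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    for item in items:
        support = sum(item in t for t in transactions) / n
        if support >= min_support:
            yield item, support


# Made-up sample data, for illustration only.
transactions = [["milk", "bread"], ["milk"], ["bread", "eggs"], ["milk", "bread"]]

# Calling a generator function returns immediately; nothing is computed yet.
result = lazy_apriori(transactions, min_support=0.6)
is_lazy = isinstance(result, types.GeneratorType)

# Iterating over the generator (here via list()) is what runs the computation.
records = list(result)
print(is_lazy, records)
```

If a generator-based result is what you actually want, wrapping the call in `list(...)` materialises it. To get a dataframe of frequent itemsets, call the mlxtend function under its own name.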
