Using lists and sets correctly in Python

Question

I am trying to write the A-Priori algorithm in Python, and I have a problem when the algorithm has to check the k-dimensional itemsets. So far, I have written this code:

def A_Priori_Algorithm_Next_Passes(file, freqk, k, s):

    input_file = open(file, 'r')
    csv_reader = csv.reader(input_file, delimiter=',')

    baskets = []

    for row in csv_reader:
        unique_row_items = set([field.strip().lower() for field in row])
        baskets.append(unique_row_items)

    input_file.close()
    all_items = []
    counts = {}
    freq = {}
    length = len(baskets)
    i = 0

    while(i < length):
        items = GetUniqueItems(baskets[i])
        items_list = list(items)
        length_1 = len(items_list)
        itemset_pairs = GetPairs(freqk)
        u = 0
        while(u < len(itemset_pairs)):
            all_items.append(tuple(itemset_pairs[u]))
            u = u + 1
        candidates = []
        q = 0
        while(q < len(itemset_pairs)):
            a1 = itemset_pairs[q][0]
            a2 = itemset_pairs[q][1]
            #print(a1)
            #print(a2)
            #candidate_sum = a1 + ',' + a2
            candidate_set = set(a1).union(set(a2))
            candidate = []
            candidate.append(candidate_set)
            if(tuple(candidate) not in candidates):
                candidates.append(tuple(candidate))
                if((len(candidate) == (k + 1)) and ((candidate < items) == True)):
                    #print(candidate)
                    if(tuple(candidate) not in counts):
                        counts[tuple(candidate)] = 1
                    else:
                        counts[tuple(candidate)] = counts[tuple(candidate)] + 1
            q = q + 1
        i = i + 1      
    i = 0
    while(i < len(all_items)):
        if(all_items[i] in counts):
            if(counts[tuple(all_items[i])] >= s):
                freq[all_items[i]] = counts[all_items[i]]
        i = i + 1

    return freq

My problem is that I can't tell when to use a list and when to use a set. The program never enters this if statement: "if((len(candidate) == (k + 1)) and ((candidate < items) == True)):". Do you have any idea what I have misunderstood? The pseudocode for the algorithm is:

Algorithm: A-Priori algorithm (k + 1) pass.

Input: F, a file containing baskets

Input: freqk, a table containing the frequencies of itemsets of size k in baskets above the threshold s

Input: k, the size of the itemsets in freqk

Input: s, the support

Output: freq, a table containing the frequencies of itemsets of size k + 1 with threshold s

counts ← ∅
freq ← ∅
foreach basket in F do
    items ← GetUniqueItems(basket)
    itemset_pairs ← GetPairs(freqk)
    candidates ← ∅
    foreach pair in itemset_pairs do
        (fp, sp) ← pair
        candidate ← fp ∪ sp
        if not candidate in candidates then
            Add(candidates, candidate)
            if |candidate| = k + 1 and candidate ⊆ items then
                counts[candidate] ← counts[candidate] + 1
foreach itemset, count in counts do
    if count ≥ s then
        freq[itemset] ← count
return freq
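
For reference, the pseudocode could be rendered in Python roughly as follows. This is only a sketch under several assumptions that are not in the post: the function name apriori_next_pass is hypothetical, baskets is already a list of sets rather than a CSV file, freqk maps frozenset itemsets to their counts, and GetPairs is taken to mean "every unordered pair of itemsets in freqk". The pairs are built once outside the basket loop because they do not depend on the basket.

```python
from itertools import combinations


def apriori_next_pass(baskets, freqk, k, s):
    """Sketch of the (k + 1) pass; see the stated assumptions above.

    baskets: iterable of sets of items (assumed already parsed)
    freqk:   dict mapping frozenset itemsets of size k to counts (assumed)
    """
    counts = {}
    # Assumed meaning of GetPairs: all unordered pairs of frequent k-itemsets.
    itemset_pairs = list(combinations(freqk, 2))
    for basket in baskets:
        items = set(basket)
        candidates = set()
        for fp, sp in itemset_pairs:
            # frozenset is hashable, so it can go into a set or be a dict key.
            candidate = frozenset(fp) | frozenset(sp)
            if candidate not in candidates:
                candidates.add(candidate)
                # <= is the subset test; < would require a proper subset.
                if len(candidate) == k + 1 and candidate <= items:
                    counts[candidate] = counts.get(candidate, 0) + 1
    # Keep only the itemsets that reach the support threshold s.
    return {itemset: n for itemset, n in counts.items() if n >= s}


# Small made-up example: four baskets, frequent single items, support 3.
baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
freq1 = {frozenset({"a"}): 4, frozenset({"b"}): 3, frozenset({"c"}): 3}
freq2 = apriori_next_pass(baskets, freq1, k=1, s=3)
```

In this made-up data the pairs {a, b} and {a, c} each occur in three baskets and are kept, while {b, c} occurs in only two and is dropped.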

Thanks in advance!

Answer 1

Sets are superior for testing membership (if x in set). A set must contain hashable data and cannot contain duplicates (try set([1, 1, 3, 4])). Sets make a lot of set-theory operations available, e.g., intersection. They are slower for adding members and are not ordered, and it's generally a good idea to use a list() if you don't have a good reason to use a set(). I encourage you to read the official Python documentation on set().
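
To make this concrete, here is a minimal sketch of those set behaviours, followed by what appears to go wrong in the if statement from the question. The sample values ("a", "b", "c" and k = 1) are made up for illustration. In the question, the union is appended to a list before its length is tested, so the length is that of the one-element list, not of the itemset.

```python
# Sets drop duplicates and support membership tests and set operations.
unique = set([1, 1, 3, 4])           # duplicates collapse
has_three = 3 in unique              # membership test
common = {1, 2, 3} & {2, 3, 4}       # intersection

# Made-up sample values standing in for the question's variables.
k = 1
items = {"a", "b", "c"}
candidate_set = set(["a"]).union(set(["b"]))

# As in the question: wrapping the set in a list gives a list of length 1,
# so `len(candidate) == k + 1` cannot hold for any k >= 1.
wrapped = [candidate_set]
wrapped_len = len(wrapped)

# Testing the set itself matches the pseudocode.
size_ok = len(candidate_set) == k + 1
is_subset = candidate_set <= items   # <= means subset, < means proper subset

# A plain set is unhashable; frozenset can be used as a dictionary key.
counts = {}
key = frozenset(candidate_set)
counts[key] = counts.get(key, 0) + 1
```

Note also that calling set() on a plain string splits it into characters, so if the itemsets are stored as strings they would need to be wrapped first (for example set(["apple"]) rather than set("apple")).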

Solution 1
0 2015-05-18 19:18:31
