Python 2.7: create dictionary from list of sets

Question

After performing some operations I get a list of itemsets like the following:

from pyspark.mllib.fpm import FPGrowth

FreqItemset(items=[u'A_String_0'], freq=303)
FreqItemset(items=[u'A_String_0', u'Another_String_1'], freq=302)
FreqItemset(items=[u'B_String_1', u'A_String_0', u'A_OtherString_1'], freq=301)

I'd like to create from this list:

an RDD

a dictionary, for example:

key: A_String_0 value: 303
key: A_String_0,Another_String_1 value: 302
key: B_String_1,A_String_0,A_OtherString_1 value: 301

I'd like to continue with calculations to produce confidence and lift.

I tried running a for loop to get each item from the list.

Is there another, better way to create an RDD and/or lists here?

Thank you in advance.

Answer 1

If you want an RDD, simply don't collect freqItemsets:
```
model = FPGrowth.train(transactions, minSupport=0.2, numPartitions=10)
freqItemsets = model.freqItemsets()
```
If you have already collected the result, you can of course parallelize it again:
```
result = model.freqItemsets().collect()
sc.parallelize(result)
```
I am not sure why you need this (it looks like an XY problem), but you can use comprehensions on the collected data:
```
{tuple(x.items): x.freq for x in result}
```
or
```
{",".join(x.items): x.freq for x in result}
```
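
To see what the two comprehensions above produce without a running Spark cluster, here is a minimal sketch. It uses a plain namedtuple as a stand-in for pyspark's FreqItemset (an assumption for illustration; only the items and freq fields are used), with the sample values copied from the question:

```python
from collections import namedtuple

# Stand-in for the FreqItemset objects returned by collect()
# (assumption: only the `items` and `freq` fields matter here).
FreqItemset = namedtuple("FreqItemset", ["items", "freq"])

# Sample data copied from the question.
result = [
    FreqItemset(items=[u'A_String_0'], freq=303),
    FreqItemset(items=[u'A_String_0', u'Another_String_1'], freq=302),
    FreqItemset(items=[u'B_String_1', u'A_String_0', u'A_OtherString_1'], freq=301),
]

# Lists are unhashable, so the items are converted to a tuple to serve as a key.
by_tuple = {tuple(x.items): x.freq for x in result}

# String keys match the comma-separated layout shown in the question.
by_string = {",".join(x.items): x.freq for x in result}

print(by_string[u'A_String_0,Another_String_1'])  # 302
```

Tuple keys are usually easier to work with afterwards, since the individual items stay separate and do not need to be split again.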

Generally speaking, if you want to apply further transformations to your data, don't collect it; process the data directly in Spark.
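
As a rough sketch of what staying in Spark could look like, the helper below (the name itemset_pairs is hypothetical) turns the RDD of itemsets into an RDD of (key, frequency) pairs with map, so nothing is pulled to the driver until you ask for it. It relies only on the map method of the RDD returned by model.freqItemsets():

```python
def itemset_pairs(freq_itemsets):
    """Map each FreqItemset to an (items-as-tuple, freq) pair.

    `freq_itemsets` is expected to be the RDD returned by
    model.freqItemsets(); the result is still an RDD.
    """
    return freq_itemsets.map(lambda x: (tuple(x.items), x.freq))

# Usage with the model from the answer (needs a running SparkContext):
# pairs = itemset_pairs(model.freqItemsets())
# as_dict = pairs.collectAsMap()  # collect only when a local dict is needed
```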

Also, you should take a look at the Scala API. It already implements association rules.
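
If you stay in Python, confidence and lift can be derived from the itemset counts using the usual definitions: confidence(A => B) = freq(A and B together) / freq(A), and lift(A => B) = confidence(A => B) / (freq(B) / N), where N is the total number of transactions. Below is a minimal sketch, assuming the counts were collected into a dictionary keyed by frozenset. The helper names, the counts and the value of n_transactions are all made up for illustration; the question gives neither N nor the count for every itemset.

```python
def confidence(freqs, antecedent, consequent):
    """Return freq(A union B) / freq(A), or None if a count is missing."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    if a not in freqs or both not in freqs:
        return None
    return float(freqs[both]) / freqs[a]


def lift(freqs, antecedent, consequent, n_transactions):
    """Return confidence(A => B) / support(B), or None if a count is missing."""
    conf = confidence(freqs, antecedent, consequent)
    b = frozenset(consequent)
    if conf is None or b not in freqs:
        return None
    support_b = float(freqs[b]) / n_transactions
    return conf / support_b


# Hypothetical counts, keyed by frozenset so item order does not matter.
freqs = {
    frozenset(['A']): 300,
    frozenset(['B']): 200,
    frozenset(['A', 'B']): 150,
}
n_transactions = 400  # hypothetical total number of transactions

print(confidence(freqs, ['A'], ['B']))            # 0.5
print(lift(freqs, ['A'], ['B'], n_transactions))  # 1.0
```

Keying by frozenset rather than by a joined string means the lookup for an itemset does not depend on the order in which FP-growth happened to list its items.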

solution1
1 ACCEPTED 2015-12-17 20:13:42
