
Efficient way to generate partition = value pairs for a list of more than 15 elements

I am generating all partitions of a list of elements (akin to the partitions of a set, or set partitions). For each of these partitions I need to assign a random number indicating its value, so that I can later run some computations on the output data, which consists of partition = value pairs.

A sample CSV for a 4-element list would contain entries like these:

p,v   
"[[1, 2, 3, 4]]",0.3999960625186746
"[[1], [2, 3, 4]]",0.49159520559753156
"[[1, 2], [3, 4]]",0.12658202037597555
"[[1, 3, 4], [2]]",0.11670775560336522
"[[1], [2], [3, 4]]",0.006059031164368345

Here is the code I have put together for this:

from collections import defaultdict
import random
import csv

partitions = []

elements = input('Please specify number of elements: ')
size = int(elements)
fileheader = str(size)

# simple menu
if size  == 1:
    partitionlist = range(1,size+1)
    print ('A one element list have 1 partition')
elif size < 28:
    partitionlist = range(1,size+1)
elif size >= 28:
    partitionlist = [0]
    print ("Invalid number. Try again...")

# generate all partitions
def partition(elements):
    if len(elements) == 1:
        yield [ elements ]
        return

    first = elements[0]
    for smaller in partition(elements[1:]):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            yield smaller[:n] + [[ first ] + subset]  + smaller[n+1:]
        # put `first` in its own subset 
        yield [ [ first ] ] + smaller

for p in partition(partitionlist):
    partitions.append([sorted(p)] + [random.uniform(0,1)])

# write the generated input to CSV file
data = partitions

def partition_value_data(size):
    with open( size+'-elem-normaldist.csv','w') as out:
        csv_out=csv.writer(out)
        csv_out.writerow(['p','v'])
        for row in data:
            csv_out.writerow(row)

partition_value_data(fileheader)

The problem I'm facing is that when the number of elements goes over 13, I get a memory error. Is it due to my computer's memory or a limit within Python itself? I'm using Python 2.7.12.

For a list with 15 elements, the number of partitions is 1382958545.

I'm trying to generate the partitions of a list of up to 30 elements. For 27 elements, the largest size my script accepts, the number of partitions is already 545717047947902329359.

Any advice is really appreciated. Thank you.

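Your issue is that you create a generator and then append every partition it yields to the partitions list. That keeps all of the partitions in memory at once, which negates any benefit of using a generator.

Instead, write each row to the file directly from the generator, so that only one partition is held in memory at a time: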

import csv
import random
import sys

elements = input('Please specify number of elements: ')
size = int(elements)
fileheader = str(size)

# simple menu: only sizes 1..27 are accepted
if size < 1 or size >= 28:
    print("Invalid number. Try again...")
    sys.exit(1)
if size == 1:
    print('A one-element list has 1 partition')
# list() keeps this working on Python 3, where range() is not a list
partitionlist = list(range(1, size + 1))

# generate all partitions lazily, one at a time
def partition(elements):
    if len(elements) == 1:
        yield [elements]
        return

    first = elements[0]
    for smaller in partition(elements[1:]):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            yield smaller[:n] + [[first] + subset] + smaller[n+1:]
        # put `first` in its own subset
        yield [[first]] + smaller


# write each partition = value pair as soon as it is generated,
# instead of collecting all of them in a list first
def partition_value_data(size):
    with open(size + '-elem-normaldist.csv', 'w') as out:
        csv_out = csv.writer(out)
        csv_out.writerow(['p', 'v'])

        for row in partition(partitionlist):
            csv_out.writerow([sorted(row), random.uniform(0, 1)])

partition_value_data(fileheader)
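
To see that the generator itself is lazy, here is a minimal sketch, assuming Python 3, that peeks at the first few partitions of a 25-element list without ever building the full list. The helper name first_partitions is hypothetical and only for illustration; itertools.islice stops the generator after the requested number of items.

```python
from itertools import islice


def partition(elements):
    """Lazily yield every partition of a non-empty list (same scheme as above)."""
    if len(elements) == 1:
        yield [elements]
        return
    first = elements[0]
    for smaller in partition(elements[1:]):
        # insert `first` into each subset of the smaller partition
        for n, subset in enumerate(smaller):
            yield smaller[:n] + [[first] + subset] + smaller[n + 1:]
        # put `first` in its own subset
        yield [[first]] + smaller


def first_partitions(size, count):
    """Hypothetical helper: return only the first `count` partitions of 1..size."""
    items = list(range(1, size + 1))
    # islice pulls `count` items and then stops; nothing else is generated
    return list(islice(partition(items), count))


for p in first_partitions(25, 3):
    print(sorted(p))
```

This returns immediately because only three partitions are ever produced. Streaming removes the memory error, but it does not shrink the output: a complete file for 15 elements still needs 1382958545 rows, so the running time and file size grow just as quickly as the partition count.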

