What is using all the RAM in this Python script?

Question

I have the following very simple Python script. In my use case, it just counts the number of distinct strings of length 2 in a DNA text file.

#!/usr/bin/python
#Count the number of distinct kmers in a file
import sys
def kmer_count(dna, k):
    total_kmers = len(dna) - k + 1
    # assemble dict of kmer counts
    kmer2count = {}
    for x in range(len(dna)+1-k):
        kmer = dna[x:x+k]
        kmer2count[kmer] = kmer2count.get(kmer, 0) + 1
    return(len(kmer2count))


workfile = "test.fa"
f = open(workfile, 'r')
dna = f.readline()
print "Number of bytes to represent input", sys.getsizeof(dna)
print "Number of items in dict", kmer_count(dna, 2)

This prints

Number of bytes to represent input 10000037
Number of items in dict 71

But when I time it using

/usr/bin/time --format="Size:%MK  Cpu:%P  Elapsed:%e" ./kmer.py

I get

Size:332776K  Cpu:100%  Elapsed:2.57

What is using all the RAM?

Answer 1

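Your for loop uses range, which in Python 2 builds a full list containing every index before the loop starts. With an input of about 10 million characters, that list is bound to be large: on a typical 64-bit CPython 2 build each int object takes about 24 bytes plus an 8-byte pointer in the list, so the total is on the order of 320 MB, which is consistent with the size reported by time.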

In Python 2, loop over xrange instead: xrange creates the numbers lazily, one at a time, as the for loop needs them.
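
As a minimal sketch of that change (not the asker's exact script), the function below is the same counter with only the loop iterator swapped. The name lazy_range is a hypothetical alias introduced here so the snippet also runs on Python 3, where range is already lazy; on Python 2 it is simply xrange.

```python
import sys

# Hypothetical alias (not in the original script): xrange on Python 2,
# plain range on Python 3, where range is already lazy.
try:
    lazy_range = xrange
except NameError:
    lazy_range = range


def kmer_count(dna, k):
    """Return the number of distinct substrings of length k in dna."""
    kmer2count = {}
    # lazy_range yields one index at a time instead of building a full list
    for x in lazy_range(len(dna) + 1 - k):
        kmer = dna[x:x + k]
        kmer2count[kmer] = kmer2count.get(kmer, 0) + 1
    return len(kmer2count)


if __name__ == "__main__":
    n = 10 ** 6
    # getsizeof on a list measures only the pointer array, not the int
    # objects it refers to, so the real cost of the list is higher still.
    print("list container: %d bytes" % sys.getsizeof(list(lazy_range(n))))
    print("lazy range:     %d bytes" % sys.getsizeof(lazy_range(n)))
    print("distinct 2-mers in ACGTACGT: %d" % kmer_count("ACGTACGT", 2))
```

Running the script under the same /usr/bin/time command as in the question should make the difference in peak memory visible; the exact figures depend on the interpreter build.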

Solution 1
1 Accepted 2015-02-01 19:57:39
