
How to use counters and zip functions with a list of lists in Python?

I have a list of lists:

results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]

I create a counter that stores the number of characters in each element of each list, counting a character only when it is one of ATGC [NOT Z].

The desired output is [[4,3],[4,2]]

Code:

counters = [Counter(sub_list) for sub_list in results]
nn = []
d = []
for counter in counters:
    atgc_count = sum((val for key, val in counter.items() if key in "ATGC"))
    nn.append(atgc_count)
d = [i - 1 for i in nn]
correctionfactor = [float(b) / float(m) for b,m in zip(nn, d)]
print nn
print correctionfactor

"Failed" Output:
[0, 0]
<closed file 'c:/test/zzz.txt', mode 'r' at 0x02B46078>

Desired Output
nn = [[4,3],[4,2]]
correctionfactor = [[1.33, 1.5],[1.33,2]]


Then I calculate the frequency of each character (pi), square it, and sum the squares (after that, I compute het = 1 - sum).

The desired output has the form [[1,2],[1,2]]. NOTE: these are NOT the real expected values. I just need the real values to be in this format.

Code:

list_of_hets = []
for idx, element in enumerate(sample):
    count_dict = {}
    square_dict = {}
    for base in list(element):
        if base in count_dict:
            count_dict[base] += 1
        else:
            count_dict[base] = 1
    for allele in count_dict:
        square_freq = (count_dict[allele] / float(nn[idx]))**2
        square_dict[allele] = square_freq        
    pf = 0.0
    for i in square_dict:
        pf += square_dict[i]   # pf --> pi^2 + pj^2...pn^2
    het = 1-pf                    
    list_of_hets.append(het)
print list_of_hets

"Failed" OUTPUT:
[-0.0, -0.0]

I need to multiply each element in list_of_hets by the correction factor:

h = [float(n) * float(p) for n,p in zip(correctionfactor, list_of_hets)]
With the values given above:
h = [[1.33, 1.5],[1.33,2]] #correctionfactor multiplied by list_of_hets 

Finally, I need to find the mean of each element in h and store it in a new list.

The desired output should read as [1.33, 1.75].

I tried to follow this example (Sum of list of lists; returns sum list).

hs = [mean(i) for i in zip(*h)]

But I get the following error: "TypeError: zip argument #1 must support iteration"

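I understand that correcting the code in the first step might fix it. I tried entering the "desired output" by hand and running the rest of the code, but had no luck.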

The first part can be done like this:

BASES = {'A', 'C', 'G', 'T'}

results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
counts = [[sum(c in BASES for c in s) for s in pair] for pair in results]
>>> counts
[[4, 3], [4, 2]]
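
Since the question asks about Counter specifically: the original attempt most likely printed [0, 0] because Counter(sub_list) counts whole strings such as 'TTTT' as keys, and 'TTTT' in "ATGC" is a substring test that is False. Here is a minimal sketch that keeps Counter, assuming one Counter per string is what is wanted (the helper name atgc_count is made up for illustration):

```python
from collections import Counter

BASES = {'A', 'C', 'G', 'T'}
results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]

def atgc_count(s):
    """Count only the A/C/G/T characters of one string (hypothetical helper)."""
    counter = Counter(s)  # iterating a string counts characters, not whole strings
    return sum(n for base, n in counter.items() if base in BASES)

counts = [[atgc_count(s) for s in pair] for pair in results]
print(counts)  # [[4, 3], [4, 2]]
```

The per-character Counter is also handy later, because the frequencies needed for het can be read straight from it.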

Once you have the counts, the correction factors can also be computed with a list comprehension:

correction_factors = [[i/float(i-1) for i in pair] for pair in counts]
>>> correction_factors
[[1.3333333333333333, 1.5], [1.3333333333333333, 2.0]]

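But you have to watch out for the case where a count is 1, since that leads to a division-by-zero error. I'm not sure how that should be handled... would a value of 1 be appropriate?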

correction_factors = [[i/float(i-1) if i-1 else 1 for i in pair] for pair in counts]
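
The remaining steps from the question (het, the correction and the final means) can be sketched the same way, keeping the nested shape throughout. This is only a sketch under a few assumptions: Z is ignored when computing frequencies, a string with a single valid base gets a correction factor of 1 (the tentative choice above), and a string with no valid bases gets 0. The helper name corrected_het is hypothetical. The TypeError in the question arises because h was a flat list of floats, so zip(*h) had nothing to iterate over. With a nested h, zip(*h) pairs up the values position by position, which is what the desired [1.33, 1.75] corresponds to.

```python
from collections import Counter

BASES = {'A', 'C', 'G', 'T'}
results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]

def corrected_het(s):
    """Return n/(n-1) * (1 - sum(p_i**2)) over the A/C/G/T characters of s."""
    counter = Counter(c for c in s if c in BASES)
    n = sum(counter.values())
    if n == 0:
        return 0.0  # assumption: no valid bases means zero heterozygosity
    het = 1.0 - sum((k / n) ** 2 for k in counter.values())
    factor = n / (n - 1) if n > 1 else 1.0  # assumption: factor 1 when n == 1
    return factor * het

# h keeps the same nested shape as results
h = [[corrected_het(s) for s in pair] for pair in results]

# zip(*h) groups the first values of every pair, then the second values
hs = [sum(col) / len(col) for col in zip(*h)]

print(h)
print(hs)
```

For the sample data only 'ATTA' contains more than one kind of base, so h is roughly [[0, 0], [0.667, 0]] and hs is roughly [0.333, 0]. If a mean per inner list is wanted instead, replace zip(*h) with h in the last comprehension.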

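The outer map iterates over the results. The inner map removes 'Z' and counts the remaining characters (here l is the list of lists from the question).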

map(lambda x:map(lambda y:len(y.replace('Z','')),x),l)
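
Note that this one-liner relies on Python 2, where map returns a list. It also only drops 'Z', so any other character outside ATGC would still be counted. A sketch of a Python 3 equivalent, using the results list from the question in place of l:

```python
results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]

# In Python 3, map() returns a lazy iterator, so each level is wrapped in list().
counts = list(map(lambda pair: list(map(lambda s: len(s.replace('Z', '')), pair)), results))
print(counts)  # [[4, 3], [4, 2]]
```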

