I have a list of lists:
results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
I want to build a counter that stores the number of characters in each element of each list, counting only the characters A, T, G and C (NOT Z).
The desired output is [[4,3],[4,2]]
**
Code:
from collections import Counter
counters = [Counter(sub_list) for sub_list in results]
nn = []
d = []
for counter in counters:
    atgc_count = sum((val for key, val in counter.items() if key in "ATGC"))
    nn.append(atgc_count)
d = [i - 1 for i in nn]
correctionfactor = [float(b) / float(m) for b, m in zip(nn, d)]
print nn
print correctionfactor
"Failed" Output:
[0, 0]
<closed file 'c:/test/zzz.txt', mode 'r' at 0x02B46078>
Desired Output
nn = [[4,3],[4,2]]
correctionfactor = [[1.33, 1.5],[1.33,2]]
**
Next I calculate the frequency of each character (pi), square it, and sum the squares; then I calculate het = 1 - sum.
The desired output has the form [[1,2],[1,2]]. NOTE: these are NOT the real expected values. I just need the real values to be in this format.
** Code
list_of_hets = []
for idx, element in enumerate(sample):
    count_dict = {}
    square_dict = {}
    for base in list(element):
        if base in count_dict:
            count_dict[base] += 1
        else:
            count_dict[base] = 1
    for allele in count_dict:
        square_freq = (count_dict[allele] / float(nn[idx]))**2
        square_dict[allele] = square_freq
    pf = 0.0
    for i in square_dict:
        pf += square_dict[i]  # pf --> pi^2 + pj^2...pn^2
    het = 1-pf
    list_of_hets.append(het)
print list_of_hets
"Failed" OUTPUT:
[-0.0, -0.0]
** I need to multiply every element in list_of_hets by the correction factor
h = [float(n) * float(p) for n, p in zip(correctionfactor, list_of_hets)]
With the values given above:
h = [[1.33, 1.5],[1.33,2]] #correctionfactor multiplied by list_of_hets
Finally, I need to find the average value of every element in h and store it in a new list.
The desired output should read as [1.33, 1.75].
I tried following this example ("Sum of list of lists; returns sum list").
hs = [mean(i) for i in zip(*h)]
But I get the following error: "TypeError: zip argument #1 must support iteration".
I understand that correcting the code at the first step may solve it. I tried to manually input the "desired outputs" and run the rest of the code, but had no luck.
The first part can be done like this:
BASES = {'A', 'C', 'G', 'T'}
results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]
counts = [[sum(c in BASES for c in s) for s in pair] for pair in results]
>>> counts
[[4, 3], [4, 2]]
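For context, the Counter-based attempt in the question returns [0, 0] because Counter(sub_list) counts whole strings, so its keys are 'TTTT' and 'CCCZ' rather than single characters, and `key in "ATGC"` then tests whether the whole string is a substring of "ATGC". A minimal sketch of the difference (the names string_counter, char_counter, old_count and new_count are only illustrative):

```python
from collections import Counter

results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]

# Counter over a sub-list counts whole strings, not characters:
# its keys are 'TTTT' and 'CCCZ'.
string_counter = Counter(results[0])
# Neither key is a substring of "ATGC", so nothing is summed.
old_count = sum(val for key, val in string_counter.items() if key in "ATGC")

# Counter over a single string counts its characters instead.
char_counter = Counter(results[0][1])  # character counts for 'CCCZ'
new_count = sum(val for key, val in char_counter.items() if key in "ATGC")

print(old_count)
print(new_count)
```

Here old_count is 0 and new_count is 3, which matches the second entry of the first pair in counts.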
Once you have the counts, the correction factor can also be calculated with a list comprehension:
correction_factors = [[i/float(i-1) for i in pair] for pair in counts]
>>> correction_factors
[[1.3333333333333333, 1.5], [1.3333333333333333, 2.0]]
However, you need to be careful when a count is 1, because i-1 is then 0 and the division raises a ZeroDivisionError. I'm not sure how you should handle that... would a value of 1 be appropriate?
correction_factors = [[i/float(i-1) if i-1 else 1 for i in pair] for pair in counts]
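The remaining steps in the question (heterozygosity, correction, averaging) could be sketched along the same lines. This is only a sketch under assumptions the question does not settle: characters outside A/C/G/T are ignored in both the frequencies and the total, and a string with fewer than two valid bases gets 0.0. The helper name corrected_het is hypothetical.

```python
from collections import Counter

BASES = {'A', 'C', 'G', 'T'}
results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]

def corrected_het(seq):
    """Return n/(n-1) * (1 - sum(p_i^2)) over the valid bases of seq.

    Assumption: characters outside BASES are ignored entirely, and
    strings with fewer than two valid bases yield 0.0.
    """
    counts = Counter(c for c in seq if c in BASES)
    n = sum(counts.values())
    if n < 2:
        return 0.0  # assumed fallback; avoids dividing by zero
    sum_sq = sum((k / float(n)) ** 2 for k in counts.values())
    return n / float(n - 1) * (1.0 - sum_sq)

# Keep the nested shape of results so zip(*h) has rows to unpack.
h = [[corrected_het(s) for s in pair] for pair in results]

# Average position-wise across the sub-lists.
hs = [sum(col) / float(len(col)) for col in zip(*h)]

print(h)
print(hs)
```

The TypeError in the question is what zip(*h) raises when h is a flat list of floats, since each unpacked argument must itself be iterable; keeping h as a list of lists avoids it.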
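The outer map traverses results; the inner map removes every 'Z' from each string and takes the length of what remains (on Python 2, where map returns a list):
map(lambda x: map(lambda y: len(y.replace('Z', '')), x), results)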
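On Python 3, map returns a lazy iterator, so the nested call above would not print as a list of lists. A small sketch of an equivalent that behaves the same on both versions (the variable name counts is just illustrative):

```python
results = [['TTTT', 'CCCZ'], ['ATTA', 'CZZC']]

# list() forces evaluation of the inner map, which is lazy on Python 3.
counts = [list(map(lambda s: len(s.replace('Z', '')), pair)) for pair in results]

print(counts)  # [[4, 3], [4, 2]]
```

Note that replace('Z', '') only strips Z; any other character that is not A, T, G or C would still be counted.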