I am stuck on a section of code. I am trying to count all the occurrences of certain characters (amino acids) in a string (a protein). There are two letters (['M', 'L']) that I need to find in the string. When I use .count() I get 1 for "M" and 10 for "L". The problem is that I cannot find the right way to add the two counts together to get 11.
protein = "MSRSLLLRFLLFLLLLPPLP"
aa = ['M', 'L']
for aminos in aa:
    if aminos in protein:
        protein.count(aminos)
There are many ways to do this. The following is one of them.
from collections import Counter
protein = "MSRSLLLRFLLFLLLLPPLP"
aminos = ['M', 'L']
# Count occurrences of all characters
amino_counter = Counter(protein)
total_count = 0
# Only consider the counts of aminos that matter
for amino in aminos:
    total_count += amino_counter.get(amino, 0)
print(total_count)
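As a small variation on the Counter code above: indexing a Counter with a key it has never seen returns 0 instead of raising KeyError, so the loop can be collapsed into a single sum(). A minimal sketch, reusing the protein and aminos values from the question:

```python
from collections import Counter

protein = "MSRSLLLRFLLFLLLLPPLP"
aminos = ['M', 'L']

# Count every character of protein in one pass
amino_counter = Counter(protein)

# A Counter returns 0 for missing keys, so no .get() default is needed
total_count = sum(amino_counter[amino] for amino in aminos)
print(total_count)  # 11
```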
There are lots of ways to do this. For example:
1. Loop over aa and add up the counts:
total = 0
for aminos in aa:
    # No need to check if aminos is in protein, because .count() returns 0 when it is not
    total += protein.count(aminos)
2. Use sum() to add up the count() values for each amino in aa:
total = sum(protein.count(amino) for amino in aa)
3. Count the characters of protein that are in aa. But first, convert aa to a set to make membership checks less expensive:
s_aa = set(aa)
total = sum(p in s_aa for p in protein)
This works because p in s_aa evaluates to True if p is in s_aa, and False otherwise. True counts as one and False counts as zero, so when you sum a bunch of True/False values, you get the number of True values.
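To see the True/False arithmetic in isolation, here is a small sketch. The flags list is a made-up illustration and is not part of the original question:

```python
# bool is a subclass of int in Python, so True behaves as 1 and False as 0
flags = [True, False, True, True]  # hypothetical example values
print(sum(flags))  # 3

protein = "MSRSLLLRFLLFLLLLPPLP"
s_aa = {'M', 'L'}

# One True/False per character of protein; summing counts the True values
matches = [p in s_aa for p in protein]
print(sum(matches))  # 11
```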
4. Count all the characters in protein, then sum the counts for the ones you care about:
counts = {}
for p in protein:
    ct = counts.get(p, 0)  # get counts[p], default to 0 if it does not exist
    counts[p] = ct + 1
total = sum(counts.get(amino, 0) for amino in aa)
Vignesh's collections.Counter technique is the same as this approach. It is better than Hamza's approach for counting the elements because it iterates over the protein string just once instead of once for each element of aa. This is also the reason my third and fourth approaches are better than #1 and #2.
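As a quick sanity check, the four approaches can be wrapped in functions and compared. The function names below are hypothetical, chosen only for this sketch. It checks that the results agree; it does not measure speed:

```python
protein = "MSRSLLLRFLLFLLLLPPLP"
aa = ['M', 'L']

def count_with_loop(protein, aa):
    # Approach 1: one .count() call per amino, accumulated in a loop
    total = 0
    for amino in aa:
        total += protein.count(amino)
    return total

def count_with_sum(protein, aa):
    # Approach 2: the same idea as a generator expression
    return sum(protein.count(amino) for amino in aa)

def count_with_set(protein, aa):
    # Approach 3: a single pass over protein with set membership checks
    s_aa = set(aa)
    return sum(p in s_aa for p in protein)

def count_with_dict(protein, aa):
    # Approach 4: tally every character once, then pick out the ones in aa
    counts = {}
    for p in protein:
        counts[p] = counts.get(p, 0) + 1
    return sum(counts.get(amino, 0) for amino in aa)

approaches = (count_with_loop, count_with_sum, count_with_set, count_with_dict)
results = [f(protein, aa) for f in approaches]
print(results)  # [11, 11, 11, 11]
```

Which approach is fastest in practice depends on the lengths of protein and aa; the standard library's timeit module can measure that on your own data.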
The easiest way might be:
sum(protein.count(a) for a in aa)
You can also get the individual counts:
all_counts = {a: protein.count(a) for a in aa}
This results in: {'M': 1, 'L': 10}
You can then sum these if you only need the total count:
sum(all_counts.values())
which results in: 11