I am dealing with ancient DNA data. I have an array with n different base pair calls for a given coordinate.
eg, ['A','A','C','C','G']
I need to set up a bit in my script whereby the most frequent call(s) are identified. If there is one, it should use that one. If there are two (or three) that are tied (eg, A and C here), I need it to randomly pick one of them.
I have been looking for a solution but cannot find anything satisfactory. The most frequent solution I see is Counter, but Counter is useless for me as c.most_common(1) will not identify that the first and second entries are tied.
You can first get the maximum count from the mapping returned by Counter with the max function, and then use a list comprehension to keep only the keys whose counts equal that maximum. Since Counter, max, and the list comprehension each take linear time, the overall time complexity of the code stays at O(n):
from collections import Counter
import random
lst = ['A','A','C','C','G']
counts = Counter(lst)
greatest = max(counts.values())
print(random.choice([item for item, count in counts.items() if count == greatest]))
This outputs either A or C.
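If the same logic is needed for many coordinates, it could be wrapped in a small function. The following is only a sketch of the approach above: the name pick_most_frequent and the rng parameter are hypothetical, added here so that the random choice can be seeded for reproducible runs.

```python
from collections import Counter
import random


def pick_most_frequent(calls, rng=random):
    """Return one of the most frequent items in calls, chosen at random on ties.

    rng is a hypothetical parameter: anything with a choice() method works,
    for example the random module itself or a seeded random.Random instance.
    """
    counts = Counter(calls)
    if not counts:
        raise ValueError("no base calls given")
    greatest = max(counts.values())
    # Every call whose count equals the maximum is a candidate.
    tied = [base for base, count in counts.items() if count == greatest]
    return rng.choice(tied)


# Tie between A and C: either may be printed.
print(pick_most_frequent(['A', 'A', 'C', 'C', 'G'], rng=random.Random(0)))
# No tie: A is the only candidate.
print(pick_most_frequent(['A', 'A', 'A', 'C', 'G']))
```

Passing a seeded random.Random instance makes the tie-breaking repeatable between runs, which may matter if results need to be reproduced later.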
Something like this would work:
import random
string = ['A','A','C','C','G']
# Count how often each distinct call occurs
dct = {}
for x in set(string):
    dct[x] = string.count(x)
max_value = max(dct.values())
# Collect every call tied for the highest count
lst = []
for key, value in dct.items():
    if value == max_value:
        lst.append(key)
print(random.choice(lst))
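Note that the loop above calls string.count once per distinct value, so the list is rescanned for each one. With only a handful of possible bases that is unlikely to matter, but the tally can also be built in a single pass. A minimal sketch, assuming a hypothetical helper named tied_modes:

```python
import random


def tied_modes(calls):
    """Return every item tied for the highest count, in first-seen order."""
    tally = {}
    for call in calls:
        # One pass over the data: bump the running count for this call.
        tally[call] = tally.get(call, 0) + 1
    if not tally:
        return []
    top = max(tally.values())
    return [call for call, n in tally.items() if n == top]


calls = ['A', 'A', 'C', 'C', 'G']
candidates = tied_modes(calls)   # ['A', 'C']
print(random.choice(candidates))
```

Returning the full list of tied calls, rather than the random pick itself, keeps the counting step deterministic and easy to check separately from the random choice.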