How do I count the number of identical characters in two strings by position using Python?
For example:
String 1: AGGCCT
          || | |
String 2: AGCCAT
These two strings are identical at 4 positions, so the function I want would return 4.
Is there a clever (i.e., fast) method for doing this, other than the obvious method of iterating through both strings at the same time?
Thanks!
Uri
I don't think any "clever" trick beats the obvious approach, if it's well executed:
# Python 3: the built-in zip is lazy (on Python 2, use itertools.izip)
sum(c1 == c2 for c1, c2 in zip(s1, s2))
Or, if the use of booleans for arithmetic irks you:
sum(1 for c1, c2 in zip(s1, s2) if c1 == c2)
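As a minimal sketch, the generator-expression approach can be wrapped in a small function and checked against the example from the question. The name `count_matches` is hypothetical, and this assumes Python 3, where the built-in `zip` is lazy and stops at the end of the shorter string.

```python
def count_matches(s1, s2):
    """Count positions at which s1 and s2 hold the same character.

    Hypothetical helper for illustration. zip() stops at the shorter
    string, so trailing characters of the longer one are ignored.
    """
    return sum(c1 == c2 for c1, c2 in zip(s1, s2))


print(count_matches("AGGCCT", "AGCCAT"))  # 4
```

If the two strings are expected to have the same length, it may be worth checking `len(s1) == len(s2)` before calling this, since a silent truncation could hide a data problem.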
Though I prefer delnan's generator expression, this works as well:
>>> from operator import eq
>>> sum(map(eq, 'abcde', 'aacce'))  # Python 2: use itertools.imap instead of map
3
If you're looking for better performance, I suspect it will be hard to beat NumPy for this:
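import numpy as np
# On Python 3, frombuffer needs bytes; assumes equal-length ASCII strings
a1 = np.frombuffer(s1.encode('ascii'), dtype=np.byte)
a2 = np.frombuffer(s2.encode('ascii'), dtype=np.byte)
print((a1 == a2).sum())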
On my system, this runs about 10x faster than the iterator-based approaches above.
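To check the speed difference on your own machine, a rough timing sketch along these lines may help. The function names, the test strings, and the repeat count are all assumptions made for illustration; the NumPy variant is skipped if NumPy is not installed, and actual ratios will depend on string length and hardware.

```python
import timeit
from operator import eq

try:
    import numpy as np  # optional: the NumPy variant is skipped if missing
except ImportError:
    np = None


def count_zip(s1, s2):
    # Generator expression over zip (Python 3).
    return sum(c1 == c2 for c1, c2 in zip(s1, s2))


def count_map(s1, s2):
    # operator.eq applied pairwise by the lazy built-in map.
    return sum(map(eq, s1, s2))


def count_numpy(b1, b2):
    # Expects equal-length bytes objects, e.g. s.encode("ascii").
    a1 = np.frombuffer(b1, dtype=np.byte)
    a2 = np.frombuffer(b2, dtype=np.byte)
    return int((a1 == a2).sum())


def time_all(s1, s2, number=5):
    """Return a dict mapping each variant's name to total seconds."""
    results = {
        "zip": timeit.timeit(lambda: count_zip(s1, s2), number=number),
        "map": timeit.timeit(lambda: count_map(s1, s2), number=number),
    }
    if np is not None:
        b1, b2 = s1.encode("ascii"), s2.encode("ascii")
        results["numpy"] = timeit.timeit(
            lambda: count_numpy(b1, b2), number=number
        )
    return results


# Made-up test data: 100,000 characters, 3 of every 4 positions match.
s1 = "AGCT" * 25000
s2 = "AGGT" * 25000

for name, seconds in time_all(s1, s2).items():
    print(name, round(seconds, 4))
```

Note that the NumPy timing here excludes the cost of encoding the strings to bytes; whether that is a fair comparison depends on whether your data already arrives as bytes.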