
How can I find the number of occurrences of a substring in a string in Python?

I have a substring 'G^ATTC' and I want to find the number of times it occurs in a string like 'ATCGCGATTC', but I cannot because of the '^'.

I used re.findall, but the result is always 0.

This is because in a regex, "^" is an anchor that means "the start of the line" (by default, the start of the string). Similarly, "$" means "the end of the line".

So when it searches for "G^ATTC", it can never match anything, because the pattern asks for a "G" immediately before the start of the line, which is impossible.

The way to fix your regex is to put a "\" before the "^" to escape it. This tells the regex engine to treat the "^" as a literal character instead of the start of the line.

So, change it to "G\^ATTC" (in Python, best written as the raw string r"G\^ATTC").
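As a minimal sketch of the escaping fix, the snippet below compares the unescaped pattern, a hand-escaped raw string, and re.escape(). The sample text is hypothetical: it was made up to contain a literal '^', which the string in the question does not. For a fixed substring, str.count() is shown as a non-regex alternative.

```python
import re

# Hypothetical sample text that contains a literal '^' (not from the question).
text = "ATCG^ATTCGG^ATTC"

# Unescaped: '^' is an anchor, so nothing can match after the 'G'.
unescaped = re.findall("G^ATTC", text)

# Escaped by hand; the raw string keeps the backslash intact for the regex engine.
escaped = re.findall(r"G\^ATTC", text)

# re.escape() builds an equivalent escaped pattern from the plain substring.
auto_escaped = re.findall(re.escape("G^ATTC"), text)

# For a fixed substring, str.count() avoids regex altogether.
plain_count = text.count("G^ATTC")

print(len(unescaped), len(escaped), len(auto_escaped), plain_count)  # 0 2 2 2
```

Both re.findall and str.count count non-overlapping occurrences, so they agree for this kind of pattern.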

Maybe something like this, which treats the '^' as a placeholder for any single character:

import re

txt = "ATCGCG1ATTCAAAAAAAAAAAAAG4ATTC"
substring = 'G^ATTC'
# Replace '^' with '.', which matches any single character
x = re.findall(substring.replace('^', '.'), txt)  # ['G1ATTC', 'G4ATTC']
print("pattern {} occurs {} times".format(substring, len(x)))

Output:

pattern G^ATTC occurs 2 times
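The wildcard approach above can be wrapped in a small helper, sketched here under stated assumptions. The function names count_with_wildcard and count_without_marker are hypothetical, and so is the second reading, in which '^' only marks a position and is dropped before counting; the question does not say which meaning is intended.

```python
import re

def count_with_wildcard(marked, text, marker="^"):
    """Count matches where each marker stands for any single character."""
    # Escape the literal pieces so only the marker becomes the '.' wildcard.
    pattern = ".".join(re.escape(part) for part in marked.split(marker))
    return len(re.findall(pattern, text))

def count_without_marker(marked, text, marker="^"):
    """Assumed alternative: the marker is only a position mark, so drop it."""
    return text.count(marked.replace(marker, ""))

print(count_with_wildcard("G^ATTC", "ATCGCG1ATTCAAAAAAAAAAAAAG4ATTC"))  # 2
print(count_with_wildcard("G^ATTC", "ATCGCGATTC"))                      # 0
print(count_without_marker("G^ATTC", "ATCGCGATTC"))                     # 1
```

On the string from the question, only the marker-dropping reading finds a match, so it is worth confirming what the '^' is supposed to mean before choosing an approach.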
