Problem with Python Code for finding length between two strings in a text file

Hello, I am quite new to Python and have started taking classes for biologists, but I have a problem with an assignment and just can't figure it out. From a .txt file I should find 2 restriction enzymes (basically just letters): "gatc" with a g or a in front and a c or t at the back, so "[ga]gatc[ct]". This occurs twice in the text file, and I should find the length between the two occurrences (xxxx[ga]gatc[ct] xxxxxxx [ga]gatc[ct]xxxx), i.e. how many x are between them. I tried to put it in groups, but I am doing something wrong. xxxx is an unknown number of letters made up of "g", "a", "t" and "c", for example ctactatctcatcttaaccttaa.

My current code is:

import regex
file = "enzyme.txt"
f=open(file, "r")
content = f.read()
print(content)
pattern = regex.compile("[ga]gatc[ct]")
for line in open("enzyme.txt"):
   for match in regex.finditer (pattern, line):
        print(match.group())
        print(line)
for lines in f:
    m=regex.search("[ga]gatc[ct] {*} [ga]gatc[ct]", lines)
    if m:
        print(len(str(m.start(1)) + str(m.end(2))))

It shows me the correct sequence and prints the line it is in, but I don't know how to find the length between the two matches. The second part of the code doesn't do anything, but it also shows no error message.
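One likely reason the second loop stays silent (an assumption from reading the code, not something checked against the original enzyme.txt): f.read() has already consumed the whole file, so "for lines in f" has nothing left to iterate over and the loop body never runs. A minimal sketch of that behaviour, using io.StringIO as a stand-in for the real file and a made-up sequence:

```python
import io

# io.StringIO stands in for open("enzyme.txt"); the sequence is invented for illustration.
f = io.StringIO("ccggatccttttagatctaa\n")

content = f.read()       # reads everything and leaves the position at the end
remaining = list(f)      # iterating now yields no lines, so a for loop would do nothing

f.seek(0)                # rewind to the start of the file
after_seek = list(f)     # the line is available again

print(remaining)
print(after_seek)
```

Calling f.seek(0) before the second loop, or searching in the content string that was already read, would avoid the empty iteration.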

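In my view, a naive solution would be:

import re

pattern = re.compile("[ga]gatc[ct]")

with open("enzyme.txt") as file:
    for line in file:
        # re.split cuts at every regex match; str.split would look for the literal text "[ga]gatc[ct]"
        pieces = pattern.split(line)
        if len(pieces) >= 3:  # at least two sites on this line
            print(len(pieces[1]))
  1. re.split divides the line into pieces at every match of the given pattern. (str.split does not understand regular expressions, so it would never find a match here.)
  2. In your case the pattern is [ga]gatc[ct].
  3. The xxxxxxx part is at index 1. Index 0 holds whatever comes before the first site, which is an empty string if the line starts with a site.
  4. You want the length of the text between the two sites, so print(len(pieces[1])).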
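An alternative sketch that avoids splitting: collect the match positions with re.finditer and subtract the end of the first site from the start of the second. This uses the standard-library re module rather than the regex module from the question. The function name gap_between_sites and the sample sequence are made up for illustration, and the sketch assumes the two sites do not overlap.

```python
import re

SITE = re.compile(r"[ga]gatc[ct]")

def gap_between_sites(sequence):
    """Return the number of letters between the first two sites, or None if fewer than two are found."""
    matches = list(SITE.finditer(sequence))
    if len(matches) < 2:
        return None
    first, second = matches[0], matches[1]
    # end() is the index just past the first site, start() is where the second begins
    return second.start() - first.end()

# Hypothetical sequence: "cc" + "ggatcc" + "tttt" + "agatct" + "aa"
sample = "ccggatccttttagatctaa"
print(gap_between_sites(sample))  # prints 4
```

For the real file, the same function could be applied to each line, or to the whole content with newlines removed if the sequence is wrapped over several lines; which of the two is right depends on how enzyme.txt is laid out.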
