I know nothing about Python. I've tried to piece together information from various threads to complete an assignment, but I still can't crack it.
Here is the assignment:
Instructions
a) Download the sequence for RAI1 mRNA (NM_030665) and use Python to count the number of ATG subsequences, using:
countATG = seq.count('ATG')
For example, for SREBF1 (NM_001005291.2), the answer is 45.
I am NOT looking for the answer to the question. I genuinely want to learn more about Python and would REALLY appreciate it if someone could tell me how to go about completing this problem. I have the sequence saved to my desktop as a .txt file, but I don't know how to load the contents of that file into the seq variable (if that makes sense). Yes, I could Ctrl+F the sequence on NCBI, but I want to learn how to use Python.
Thank you!!
Here you go:
filepath = '/path/to/file.txt'
with open(filepath) as infile:
    # readlines() returns a list of strings, one per line of the file
    lines = infile.readlines()
# This will bring in the sequence, but if it's split across multiple lines
# (for example, wrapped every 50 bp), then you'll want to piece it back
# together so you don't miss any ATGs that straddle a line break.
seq = ''.join(line.strip() for line in lines)
ATG_count = seq.count('ATG')
print(ATG_count)
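One thing to watch for: if the file was saved from NCBI in FASTA format, its first line is probably a description starting with '>', and that line is not part of the sequence. Here is a rough sketch of how you might skip it and also normalize the case before counting. The function names sequence_from_lines and read_sequence are made up for illustration, and this assumes the file contains only header lines and sequence lines.

```python
def sequence_from_lines(lines):
    """Join sequence lines into one uppercase string, skipping FASTA headers."""
    parts = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and FASTA header lines, which start with '>'.
        if not line or line.startswith('>'):
            continue
        parts.append(line)
    # Uppercase so that 'atg' and 'ATG' are counted the same way.
    return ''.join(parts).upper()


def read_sequence(path):
    """Read a plain-text or FASTA file and return its sequence as one string."""
    with open(path) as infile:
        return sequence_from_lines(infile)


# Small in-memory demo: the header mentions ATG but is ignored, and the
# second ATG is split across two lines.
demo_lines = [">example header with ATG in the name", "ccatgAT", "Gtt"]
demo_seq = sequence_from_lines(demo_lines)
print(demo_seq.count('ATG'))  # prints 2
```

With your own file you would call seq = read_sequence('/path/to/file.txt') and then seq.count('ATG') as in the assignment.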