
Retrieving the position of a substring within a string

I am trying to get the positions of the letters of each word in a list, relative to the original string.

In the code, k is a list containing "ATCGCATCG" split into 3 pieces: "ATC", "GCA" and "TCG". For each piece, I want to retrieve its first and last position in the original string. ATC should give 1 and 3, since A is the 1st letter and C is the 3rd. Likewise, GCA should give 4 and 6, and so on.

So the output should look like this:

PART1    ATC  1 3 
PART2    GCA  4 6
PART3    TCG  7 9

However, what I am able to get is:

PART1    ATC  0 0 
PART2    GCA  1 2
PART3    TCG  2 4

The code producing this output is:

def separate(string,n):
    k = [string[i:i+n] for i in range(0, len(string),n)]
    yield k
    i=1
    for element in k:
                    print 'PART' + str(i) + '\t' + element + '\t' + str(int(k.index(element))) + str(int((k.index(element)) + int(k.index(element)))) 
                    i=i+1


for it in list((separate("ATCGCATCG", n =3))):
        print it

I would appreciate it if you could show me an option.

Thanks!

IIUC, I think you're overcomplicating things. Just build your strings in a loop and yield.

def foo(string, n):
    c = 1
    for i in range(0, len(string), n):
        yield '\t'.join(['PART{}'.format(c), string[i : i + n], str(i + 1), str(i + n)])
        c += 1

for i in foo("ATCGCATCG", 3):
     print(i)

PART1   ATC 1   3
PART2   GCA 4   6
PART3   TCG 7   9
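
One caveat with the loop above: `str(i + n)` assumes the string length is an exact multiple of `n`. If it is not, the last piece is shorter and the reported end position overshoots the string. A minimal sketch that reports the true end instead, under the hypothetical name `foo_clamped` (not part of the answer above):

```python
def foo_clamped(string, n):
    """Yield 'PARTk<TAB>piece<TAB>start<TAB>end' lines (1-based, inclusive)."""
    # enumerate(..., 1) numbers the parts from 1 without a manual counter
    for c, i in enumerate(range(0, len(string), n), 1):
        piece = string[i:i + n]
        # Use len(piece) rather than n so a short final piece gets its real end
        yield '\t'.join(['PART{}'.format(c), piece, str(i + 1), str(i + len(piece))])

for line in foo_clamped("ATCGCATCGA", 3):
    print(line)
```

For a string whose length is a multiple of `n`, this should give the same lines as `foo`.
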

def separate(string, n):
    k = [string[i:i+n] for i in range(0, len(string), n)]
    current = string
    start = 0
    end = 0
    # Number the parts from 1 so the labels match the requested output
    for i, element in enumerate(k, 1):
        # Offset of the piece in the remaining text, shifted to a 1-based position
        start = end + current.index(element) + 1
        end = start + len(element) - 1
        current = string[end:]  # unprocessed remainder of the string
        print("PART{i}\t{el}\t{s} {e}".format(i=i, el=element, s=start, e=end))

separate("ATCGCATCG", n=3)

Output:

PART1   ATC 1 3
PART2   GCA 4 6
PART3   TCG 7 9

Since each part has a fixed length, I think you can try this:

def separate(string,n):
    k = [string[i:i+n] for i in range(0, len(string),n)]
    yield k
    for curr_index in range(len(k)):
        element = k[curr_index]
        curr = curr_index * n + 1
        # Separate start and end with a space; without it they print fused together, e.g. "13"
        print ('PART' + str(curr_index + 1) + '\t' + element + '\t' + str(curr) + ' ' + str(curr + n - 1))

for it in list((separate("ATCGCATCG", n =3))):
        print (it)

It takes curr_index, the index of the element currently being visited in the for loop, and uses it to calculate the element's position in the original string. Hope this helps.
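
To spell out the arithmetic: the piece with 0-based index j covers the 1-based positions j*n + 1 through j*n + n. The question's code used `k.index(element)`, which returns the element's position in the list `k` (and only the first match if a piece repeats), not its position in the string. Below is a rough sketch of the same calculation that returns tuples instead of printing, so the positions can be checked against slices of the original string. The name `chunk_positions` and the `min(...)` clamp for a short last piece are my own assumptions, not part of the answers above.

```python
def chunk_positions(string, n):
    """Return (label, piece, start, end) tuples; start/end are 1-based and inclusive."""
    rows = []
    num_pieces = (len(string) + n - 1) // n  # ceiling division
    for j in range(num_pieces):
        start = j * n + 1
        # Clamp so a shorter final piece does not run past the end of the string
        end = min(start + n - 1, len(string))
        rows.append(('PART{}'.format(j + 1), string[start - 1:end], start, end))
    return rows

for label, piece, start, end in chunk_positions("ATCGCATCG", 3):
    print('{}\t{}\t{} {}'.format(label, piece, start, end))
```

Returning the data rather than printing inside the function also avoids mixing `yield` and `print` in one function, which is what makes the iteration at the call site awkward.
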
