Variable not assigning value in python in while loop

Question

The following code doesn't seem to work. I want the ini variable to increment, and logically the code seems to do so, but it doesn't.

def refinexml(xml):
    links = []
    ini = 0
    while xml[ini:].find('<loc>') != -1:
        links.append(xml[xml[ini:].find('<loc>') + 5:xml[ini:].find('</loc>')])
        ini = xml[ini:].find('</loc>')
        print ini
    return links

Answer 1

When you slice xml with xml[ini:], you get only the tail of the string, so find() returns the position of the substring within that slice, not within the whole of xml. For example, let xml be this:

<loc> blarg </loc> abcd <loc> text </loc>

On the first iteration, ini is 0, so the slice is the whole string. find('<loc>') returns 0 and find('</loc>') returns 12, so you capture " blarg " and ini is set to 12.

This is where it goes wrong. On the next iteration, you slice xml at ini to get "</loc> abcd <loc> text </loc>". Calling find('<loc>') on that slice finds the second "<loc>" in xml, which is the first occurrence of that substring in the slice. The problem is that find() reports its index within the slice, 12, not its index within xml, 24, which is what you want. Likewise, find('</loc>') now returns 0, because the slice begins with "</loc>", so ini is set back to 0 instead of moving forward. Every index you get from the slice is missing the first ini characters of the string.
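To make the offset concrete, here is a small sketch using the example string above. The variable names relative and absolute are mine, not from the question; the second call uses the optional start argument of str.find(), which searches from a given position and reports an index in the whole string.

```python
xml = "<loc> blarg </loc> abcd <loc> text </loc>"
ini = 12  # position of the first "</loc>" in xml

relative = xml[ini:].find("<loc>")  # index counted from the start of the slice
absolute = xml.find("<loc>", ini)   # index counted from the start of xml

print(relative)        # 12
print(absolute)        # 24
print(ini + relative)  # 24, the slice index shifted back by ini
```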

Fortunately, you know how many characters short you are. You need to add ini, which you can do like this:

ini = ini + xml[ini:].find('</loc>')

That, of course, can be shortened to this:

ini += xml[ini:].find('</loc>')

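So the assignment itself is fixed by adding a single character. Keep in mind that the indices used inside links.append(...) come from the same slice and need the same offset, and that ini has to move past the closing tag (add 6, the length of '</loc>'). Otherwise the next find('</loc>') matches the same tag again at offset 0 and ini stops advancing.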
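For reference, here is a minimal sketch of the question's function with every slice-relative index converted back to an absolute position. It goes a little further than the one-character change: it also offsets the indices used for the capture, moves ini past the closing tag, and stops on an unclosed tag. Those extra steps and the names open_rel, close_rel, start and end are my own assumptions about what the function is meant to do, and it is written for Python 3.

```python
def refinexml(xml):
    """Collect the text between each <loc>...</loc> pair (slice-based sketch)."""
    links = []
    ini = 0
    while True:
        open_rel = xml[ini:].find("<loc>")      # relative to the slice
        if open_rel == -1:
            break
        start = ini + open_rel + len("<loc>")   # absolute index of the content
        close_rel = xml[start:].find("</loc>")  # relative to the new slice
        if close_rel == -1:
            break                               # unclosed tag: stop searching
        end = start + close_rel                 # absolute index of "</loc>"
        links.append(xml[start:end])
        ini = end + len("</loc>")               # continue after the closing tag
    return links


print(refinexml("<loc> blarg </loc> abcd <loc> text </loc>"))
# [' blarg ', ' text ']
```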

As mentioned in the comments, though, you should really use an XML parser.
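A rough sketch of the parser route, using xml.etree.ElementTree from the standard library. The sample document and its urlset/url wrapper elements are hypothetical, since the question doesn't show the real input, and the sketch assumes the document declares no XML namespace. If it does, the tag passed to iter() would need the "{namespace-uri}" prefix.

```python
import xml.etree.ElementTree as ET


def refinexml(xml):
    """Return the stripped text of every <loc> element in the document."""
    root = ET.fromstring(xml)
    # iter("loc") walks the whole tree and yields each element tagged "loc"
    return [loc.text.strip() for loc in root.iter("loc") if loc.text]


# Hypothetical input: a single root element wrapping the <loc> entries
sample = (
    "<urlset>"
    "<url><loc> blarg </loc></url>"
    "<url><loc> text </loc></url>"
    "</urlset>"
)
print(refinexml(sample))  # ['blarg', 'text']
```

Unlike the string-searching versions, this raises xml.etree.ElementTree.ParseError on malformed input instead of silently returning wrong slices.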

Answer 2

@KSFT explained this very well. I'll just point out you can eliminate a lot of redundant invocations of find() in your code using something like this:

def refinexml(xml):
    links = []

    start = xml.find('<loc>')
    while start != -1:
        start += 5
        end = xml.find('</loc>', start)
        links.append(xml[start:end].strip())
        start = xml.find('<loc>', end + 6)
    return links

But, really, you should just use an XML parser, as even this code makes some potentially dangerous assumptions.

2 answers

solution1
1 ACCEPTED 2015-03-18 22:27:51

solution2
1 2015-03-18 22:32:51
