
How does the following for loop work?

def escape_html(s):
    for (i, o) in (("&","&amp;"),(">", "&gt;"),("<", "&lt"),('"', "&quot;")):
        s = s.replace(i , o)
    return s

I haven't seen something like this before.

What does the first line of the for loop mean?

In general, what does the loop do and how does it do it?

Note: s is a string

Please try to explain the full iterative process.

In English:

For each pair of values in the following list of pairs of values, do the stuff in the loop. In this case, (i, o) just means "Assign the values from the pair to variables named i & o."

During the first iteration, i is "&" and o is "&amp;"

Each time through the loop, it replaces every occurrence of i in s with the replacement in o, so any "&" in the source text becomes "&amp;", ">" becomes "&gt;", etc.
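Because each pass works on the result of the previous one, the order of the pairs matters: "&" has to be handled first, otherwise the ampersands introduced by the other entities get escaped again. A minimal sketch of that effect (apply_pairs, amp_first and amp_last are names made up for this illustration, not part of the original code):

```python
def apply_pairs(s, pairs):
    # Apply each (old, new) replacement in order, feeding the result forward.
    for old, new in pairs:
        s = s.replace(old, new)
    return s

amp_first = (("&", "&amp;"), (">", "&gt;"))
amp_last = ((">", "&gt;"), ("&", "&amp;"))

text = "a > b & c"
print(apply_pairs(text, amp_first))  # a &gt; b &amp; c
print(apply_pairs(text, amp_last))   # a &amp;gt; b &amp; c  (the "&" of "&gt;" was escaped again)
```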

This is pretty straightforward Python.

The for loop is unpacking individual items from an iterable. So, for example, you could do something like this:

>>> c = [('a', 'b', 'c'), ('d', 'e', 'f')]
>>> for i, j, k in c:
...     print i, j, k
... 
a b c
d e f

In your case, i and o are populated with the values from each inner tuple of the tuple of tuples. Instances of i in s are then replaced with the value of o. This function replaces HTML special characters with the entity representing each.

>>> s = 'foo & bar'
>>> s = s.replace('&', '&amp;')
>>> s
'foo &amp; bar'

This function is equivalently doing:

def escape_html(s):
    s = s.replace("&","&amp;")
    s = s.replace(">", "&gt;")
    s = s.replace("<", "&lt")
    s = s.replace('"', "&quot;")
    return s
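One detail worth noticing in the pairs: "&lt" has no trailing semicolon, while the other three entities do. The usual form of the entity is "&lt;". Assuming a well-formed entity was intended, a sketch of the same unrolled function with that one string changed would be:

```python
def escape_html(s):
    s = s.replace("&", "&amp;")
    s = s.replace(">", "&gt;")
    s = s.replace("<", "&lt;")    # semicolon added; the code in the question has "&lt"
    s = s.replace('"', "&quot;")
    return s

print(escape_html('<a href="x">Tom & Jerry</a>'))
# &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```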

In lieu of using a proper debugger, try adding some print statements to see what is going on:

def escape_html(s):
    print "ORIGINAL STRING: %s" % (s)
    for (i, o) in (("&","&amp;"),(">", "&gt;"),("<", "&lt"),('"', "&quot;")):
        print "\t(i, o) = ('%s', '%s')" % (i, o)
        s = s.replace(i , o)
        print "\ts = %s" % (s, )
        print
    return s

mystring = """<h3>This is a test</h3><script>alert("I hacked you!");</script>"""
print escape_html(mystring)

OUTPUT

ORIGINAL STRING: <h3>This is a test</h3><script>alert("I hacked you!");</script>
    (i, o) = ('&', '&amp;')
    s = <h3>This is a test</h3><script>alert("I hacked you!");</script>

    (i, o) = ('>', '&gt;')
    s = <h3&gt;This is a test</h3&gt;<script&gt;alert("I hacked you!");</script&gt;

    (i, o) = ('<', '&lt')
    s = &lth3&gt;This is a test&lt/h3&gt;&ltscript&gt;alert("I hacked you!");&lt/script&gt;

    (i, o) = ('"', '&quot;')
    s = &lth3&gt;This is a test&lt/h3&gt;&ltscript&gt;alert(&quot;I hacked you!&quot;);&lt/script&gt;

&lth3&gt;This is a test&lt/h3&gt;&ltscript&gt;alert(&quot;I hacked you!&quot;);&lt/script&gt;

For each pair of items i, o in the sequence of pairs (("&","&amp;"),(">", "&gt;"),("<", "&lt"),('"', "&quot;")), replace every instance of i with o in the string s.

for (i, o) in (("&","&amp;"),(">", "&gt;"),("<", "&lt"),('"', "&quot;")):

i and o are your loop variables. & > < " are the characters to be replaced, and &amp; &gt; &lt; &quot; are the HTML entities that replace them.

In the first iteration of the loop, i = & and o = &amp;. In the second iteration, i = > and o = &gt;, and so on.

The thing you're iterating over is a tuple of tuples (pairs in this case).

So for each iteration of the loop, i gets the first thing, and o gets the second. E.g., on the first iteration, i gets & and o gets &amp;.

So then it just keeps creating new strings with i replaced by o.
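The "new strings" part is why the loop body assigns back to s: Python strings are immutable, so replace returns a new string and leaves the original alone. A short sketch (the variable names original and escaped are just for illustration):

```python
original = "Tom & Jerry"
escaped = original.replace("&", "&amp;")  # returns a new string

print(original)  # Tom & Jerry       (unchanged)
print(escaped)   # Tom &amp; Jerry
```

Without the s = ... assignment in the loop, each replacement would be computed and then thrown away.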

Consider that tuple as tupl = (("&","&amp;"),(">", "&gt;"),("<", "&lt"),('"', "&quot;")) to make it simpler.

So the items of tupl are ("&","&amp;") , (">", "&gt;") , etc

so the for loop becomes:

  for (i,o) in tupl:

What it does is fetch the items from tupl one by one and try to do something like:

(i,o)=("&","&amp;") , or simply i,o=("&","&amp;") , which assigns the '&' to i and &amp; to o in the first iteration and > to i and &gt; to o in second iteration, and so on.
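To make that concrete, here is a sketch of the same loop with the unpacking written out as a separate statement. The loop variable pair and the sample string are made up for the example; tupl is the tuple from above, including its original "&lt":

```python
tupl = (("&", "&amp;"), (">", "&gt;"), ("<", "&lt"), ('"', "&quot;"))

s = 'say "hi" & go'
for pair in tupl:
    i, o = pair          # same as i = pair[0]; o = pair[1]
    s = s.replace(i, o)

print(s)  # say &quot;hi&quot; &amp; go
```

Writing for (i, o) in tupl: just folds the i, o = pair line into the loop header.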

(("&","&amp;"),(">", "&gt;"),("<", "&lt"),('"', "&quot;")) are tuples in a tuple.

Let's reduce it to simpler terms.

for (x, y) in ( ('a', 'b'), ('c', 'd') ):
    print x, y   

This prints the contents of each tuple...

a b
c d

Perhaps that clears things up.

(("&","&amp;"),(">", "&gt;"),("<", "&lt"),('"', "&quot;")) is a tuple with 4 elements in it.

The element at index 0 is the tuple ("&","&amp;")

When you say a, b = 0, 1, Python evaluates it the same as (a, b) = (0, 1), where the variables are assigned the corresponding values. That is, a takes the value 0 and b takes the value 1.

Your for loop effectively loops over the big tuple with 4 elements in it. Since each element is a 2-tuple, you are able to assign their respective values to the two variables i and o.
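The "each element is a 2-tuple" part is what makes this work. If an element has a different number of items than there are loop variables, the unpacking raises a ValueError. A small sketch with a deliberately malformed tuple (pairs, seen and error are names invented for this example):

```python
pairs = (("&", "&amp;"), (">", "&gt;", "extra"))  # second element has 3 items

seen = []
error = None
try:
    for i, o in pairs:
        seen.append((i, o))
except ValueError as exc:
    # Raised on the second iteration: 3 values cannot be unpacked into 2 names.
    error = exc

print(seen)                 # [('&', '&amp;')]
print(error is not None)    # True
```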

(("&","&amp;"),(">", "&gt;"),("<", "&lt"),('"', "&quot;")) is a 4-tuple with each element being a 2-tuple (for example, ("&","&amp;")). A tuple is a fixed-length sequence. You can read up more on them here: http://anh.cs.luc.edu/python/hands-on/3.1/handsonHtml/loopsandtuples.html

The first line is just a for loop over the sequence. The left-hand side (before the 'in') takes advantage of Python's tuple unpacking. It takes the two values of a tuple and distributes them, one into i and the other into o.

In general, for each tuple, the loop body replaces every occurrence in s of the tuple's first element with its second element.
