
template url reversal escaping surt arguments

I'm having an issue where template URL reversal escapes colon and parenthesis characters. I want these characters to remain unescaped in the anchor tag's href attribute. It behaved that way on Django 1.3, but after upgrading to 1.6 it no longer does.

What I have:

surt = 'http://(gov/'
browse_domain = 'gov'
... in template ...
<a href="{% url 'nomination.views.url_surt' project.project_slug surt %}">{{ browse_domain }}</a>

This yields:

<a href="/nomination/eth2008/surt/http%3A//%28gov/">gov</a>

As you can see, the colon : and left parenthesis ( characters are being escaped in the href attribute. I don't want that.

What I want:

surt = 'http://(gov/'
browse_domain = 'gov'
... in template ...
<a href="{% url 'nomination.views.url_surt' project.project_slug surt %}">{{ browse_domain }}</a>

This should yield:

<a href="/nomination/eth2008/surt/http://(gov/">gov</a>

Does anyone know how to keep these characters from being escaped when I reverse URLs in my anchor tag?

NOTE: The below answer is wrong. urllib.quote(safe=':()') will indeed keep those safe characters unescaped. Something else is happening in django to cause this problem and I still don't know where it is.
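The claim in this note can be checked with a few lines. This is a minimal sketch that assumes Python 3, where Python 2's urllib.quote lives at urllib.parse.quote; the variable names are illustrative, and '/' is added to the note's safe characters so that it keeps its default treatment.

```python
from urllib.parse import quote  # Python 3 location of Python 2's urllib.quote

surt = "http://(gov/"

# Default safe set is '/' only, so ':' and '(' are percent-encoded.
default_quoted = quote(surt)

# Listing ':', '(' and ')' as safe leaves them untouched.
kept = quote(surt, safe="/:()")

print(default_quoted)  # http%3A//%28gov/
print(kept)            # http://(gov/
```

The first result matches the escaped path segment in the question's output, which hints that the escaping comes from a quoting call that uses the default safe set. Which Django call that is remains unestablished here.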

In Django 1.6, any URL reversed in a template first passes through iri_to_uri() before it is rendered to HTML. The {% url %} tag offers no way to override this.

Notice this bit of italicized text detailing the change.

This is iri_to_uri()

def iri_to_uri(iri):
    """
    Convert an Internationalized Resource Identifier (IRI) portion to a URI
    portion that is suitable for inclusion in a URL.

    This is the algorithm from section 3.1 of RFC 3987.  However, since we are
    assuming input is either UTF-8 or unicode already, we can simplify things a
    little from the full method.

    Returns an ASCII string containing the encoded result.
    """
    # The list of safe characters here is constructed from the "reserved" and
    # "unreserved" characters specified in sections 2.2 and 2.3 of RFC 3986:
    #     reserved    = gen-delims / sub-delims
    #     gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    #     sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
    #                   / "*" / "+" / "," / ";" / "="
    #     unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
    # Of the unreserved characters, urllib.quote already considers all but
    # the ~ safe.
    # The % character is also added to the list of safe characters here, as the
    # end of section 3.1 of RFC 3987 specifically mentions that % must not be
    # converted.
    if iri is None:
        return iri
    return urllib.quote(smart_str(iri), safe="/#%[]=:;$&()+,!?*@'~")

At first glance, it might look like :, ( and ) are protected from percent-encoding because they are passed as 'safe' to urllib.quote():

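# always_safe as defined in Python 2.7's urllib.py; the loop below depends on it
always_safe = ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'
               'abcdefghijklmnopqrstuvwxyz'
               '0123456789' '_.-')
_safe_map = {}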
for i, c in zip(xrange(256), str(bytearray(xrange(256)))):
    _safe_map[c] = c if (i < 128 and c in always_safe) else '%{:02X}'.format(i)
_safe_quoters = {}

def quote(s, safe='/'):
    # fastpath
    if not s:
        if s is None:
            raise TypeError('None object cannot be quoted')
        return s
    cachekey = (safe, always_safe)
    try:
        (quoter, safe) = _safe_quoters[cachekey]
    except KeyError:
        safe_map = _safe_map.copy()
        safe_map.update([(c, c) for c in safe])
        quoter = safe_map.__getitem__
        safe = always_safe + safe
        _safe_quoters[cachekey] = (quoter, safe)
    if not s.rstrip(safe):
        return s
    return ''.join(map(quoter, s))

If you step through the actual urllib.quote() method as shown above, 'safe' actually means that those characters will be escaped/quoted. Initially, I thought 'safe' meant 'safe-from-quoting', which caused me a great deal of confusion. I guess they instead mean 'safe' as in 'safe-in-terms-of-sections-2.2-and-2.3-of-RFC-3986'. Perhaps a more elaborately named keyword argument would be prudent, but then again, there's a whole cornucopia of things I find awkward regarding urllib. ಠ_ಠ

After much research, and because we don't want to modify Django core methods, our team decided to do some hacky URL construction in the template (the very kind the Django docs strongly eschew). It's not perfect, but it works for our use case.
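The exact template code is not shown, so the following is only a rough Python sketch of the same idea: build the path by hand and quote the SURT with the safe list from iri_to_uri(). The helper name surt_href is hypothetical, the path layout is inferred from the rendered output in the question, and Python 3's urllib.parse.quote stands in for Python 2's urllib.quote.

```python
from urllib.parse import quote  # Python 3 location of Python 2's urllib.quote

# Safe list copied from iri_to_uri() as quoted above.
IRI_SAFE = "/#%[]=:;$&()+,!?*@'~"


def surt_href(project_slug, surt):
    """Hypothetical helper: build the SURT link path without URL reversal.

    The '/nomination/<slug>/surt/<surt>' layout is an assumption taken from
    the question's output. Hard-coding it duplicates the URLconf, which is
    the drawback of this workaround.
    """
    prefix = "/nomination/{}/surt/".format(quote(project_slug, safe=""))
    # Only characters outside IRI_SAFE are percent-encoded.
    return prefix + quote(surt, safe=IRI_SAFE)


print(surt_href("eth2008", "http://(gov/"))  # /nomination/eth2008/surt/http://(gov/
```

Any change to the URL pattern has to be mirrored in the hand-built prefix, so this trades maintainability for control over the escaping.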
