Which URL parsing function pair should I be using and why? urlparse and urlunparse, or urlsplit and urlunsplit?
Directly from the docs you linked yourself:
urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)
This is similar to urlparse(), but does not split the params from the URL. This should generally be used instead of urlparse() if the more recent URL syntax allowing parameters to be applied to each segment of the path portion of the URL (see RFC 2396) is wanted.
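To make the quoted difference concrete, here is a minimal sketch using Python 3's urllib.parse. The URL is a made-up example chosen to carry a ";" parameter on its last path segment.

```python
from urllib.parse import urlparse, urlsplit, urlunparse, urlunsplit

# Hypothetical URL with a ";type=x" parameter on the last path segment.
url = "http://example.com/a/b;type=x?q=1#frag"

parsed = urlparse(url)  # splits ";type=x" off into the separate .params field
split = urlsplit(url)   # leaves ";type=x" inside .path

print(parsed.path, parsed.params)
print(split.path)

# Each result round-trips through its matching "un" function.
rebuilt_from_parse = urlunparse(parsed)
rebuilt_from_split = urlunsplit(split)
```

Whichever function you parse with, rebuild with its partner: urlunparse expects six components and urlunsplit expects five.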
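As the documentation says, urlparse.urlparse (urllib.parse.urlparse in Python 3) returns a 6-tuple, which includes an additional params element, while urlparse.urlsplit returns a 5-tuple. The extra element is the one in this row of the documentation's attribute table: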
Attribute | Index | Value | Value if not present
params | 3 | Parameters for last path element | empty string
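The tuple shapes from the table row above can be checked directly. This is a small sketch with an invented URL, written against Python 3's urllib.parse rather than the Python 2 urlparse module named in the answer.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL; "params" here is just placeholder text after the ";".
url = "http://example.com/path;params?query=1#frag"

six = tuple(urlparse(url))   # scheme, netloc, path, params, query, fragment
five = tuple(urlsplit(url))  # scheme, netloc, path, query, fragment

print(len(six), six[3])    # index 3 is params in the 6-tuple
print(len(five), five[3])  # index 3 is query in the 5-tuple

# When the URL has no ";" parameter, params is the empty string.
no_params = urlparse("http://example.com/path").params
```

Because index 3 means something different in each tuple, accessing fields by attribute name (.query, .params) is less error-prone than indexing.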
FYI: RFC 2396 explains how parameters changed in the URL specification:
Extensive testing of current client applications demonstrated that the majority of deployed systems do not use the ";" character to indicate trailing parameter information, and that the presence of a semicolon in a path segment does not affect the relative parsing of that segment. Therefore, parameters have been removed as a separate component and may now appear in any path segment. Their influence has been removed from the algorithm for resolving a relative URI reference.
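The practical consequence is that urlparse only separates parameters on the last path segment, so a ";" in an earlier segment stays in the path. The sketch below shows this with an invented URL, and includes a hypothetical helper, split_segment_params, which is not part of urllib.parse and only illustrates one way to handle per-segment parameters on top of urlsplit.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL with parameters on two different path segments.
url = "http://example.com/a;x=1/b;y=2"

parsed = urlparse(url)
print(parsed.path, parsed.params)  # only the last segment's ";y=2" is split off


def split_segment_params(path):
    """Hypothetical helper: return (segment, params) pairs for each path segment."""
    pairs = []
    for segment in path.lstrip("/").split("/"):
        name, _, params = segment.partition(";")
        pairs.append((name, params))
    return pairs


# urlsplit keeps the path intact, so every segment can be handled uniformly.
segments = split_segment_params(urlsplit(url).path)
print(segments)
```

This is one reason the documentation steers you toward urlsplit: it makes no assumption about where parameters appear.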