
Which should I be using: urlparse or urlsplit?

Which URL parsing function pair should I be using and why?

Directly from the docs you linked yourself:

urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)
This is similar to urlparse(), but does not split the params from the URL. This should generally be used instead of urlparse() if the more recent URL syntax allowing parameters to be applied to each segment of the path portion of the URL (see RFC 2396) is wanted.

As the documentation says:
urlparse.urlparse returns a 6-tuple (it includes an additional params element)
urlparse.urlsplit returns a 5-tuple (no params element)

Attribute | Index | Value                            | Value if not present
params    | 3     | Parameters for last path element | empty string
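
To make the difference concrete, here is a minimal sketch using the Python 3 module name (urllib.parse). The URL is a made-up example, not one from the question.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL with a ";type=a" parameter on the last path segment.
url = "http://example.com/path/file;type=a?q=1#frag"

parsed = urlparse(url)  # 6 fields: scheme, netloc, path, params, query, fragment
split = urlsplit(url)   # 5 fields: scheme, netloc, path, query, fragment

# urlparse moves the trailing parameters out of the path into index 3.
print(len(parsed), parsed.path, parsed.params)

# urlsplit leaves the ";type=a" part inside the path.
print(len(split), split.path)
```

If you never use ";" parameters, both calls give you the same scheme, netloc, path, query and fragment.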


FYI: RFC 2396 says the following about parameters in the URL specification:

Extensive testing of current client applications demonstrated that the majority of deployed systems do not use the ";" character to indicate trailing parameter information, and that the presence of a semicolon in a path segment does not affect the relative parsing of that segment. Therefore, parameters have been removed as a separate component and may now appear in any path segment. Their influence has been removed from the algorithm for resolving a relative URI reference.
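
Since parameters may now appear in any path segment, here is a rough sketch of what that means in practice. The URL and the per-segment splitting loop are my own illustration, not something the docs prescribe: urlparse only strips parameters from the last segment, so with urlsplit you keep the path intact and split each segment yourself.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL with parameters on two different path segments.
url = "http://example.com/a;x=1/b;y=2"

parsed = urlparse(url)
split = urlsplit(url)

# urlparse only looks at the last segment; ";x=1" stays in the path.
print(parsed.path, parsed.params)

# With urlsplit the whole path is untouched, so each segment can be
# split on its first ";" (an illustrative convention, assumed here).
segments = []
for seg in split.path.split("/"):
    if seg:
        name, _, params = seg.partition(";")
        segments.append((name, params))
print(segments)
```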


 