python urljoin not finding the absolute path

Question

I'm trying to get the absolute path, but I don't get the correct result. This is what I'm trying:

Given I have this html page url:

url1 = 'build/en/index.html'

and I have this relative path in the file:

url2 = '/pub-assets/css/indexen.css'

I'm doing:

urljoin(url1, url2)

I expect to get build/pub-assets/css/indexen.css

but that is not the result I get. Any suggestions are much appreciated.

Answer 1

If your url1 points to a file (instead of a directory), you can rebuild the path yourself: split the URL with urlsplit, then use SplitResult._replace to swap in the new path.

try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit      # Python 2

url1 = 'https://example.com/en/index.html'
url2 = 'pub-assets/css/indexen.css'

parts = urlsplit(url1)
# Keep everything up to and including the last '/', then append url2
new_path = parts.path[:parts.path.rfind('/') + 1] + url2
joined = parts._replace(path=new_path)
print(joined.geturl())  # Outputs https://example.com/en/pub-assets/css/indexen.css

This assumes that url1 is an absolute URL and url2 is a relative path with no leading slash.

Answer 2

Python 3.6.1:

>>> u1 = 'https://example.com/en/index.html'
>>> u2 = 'pub-assets/css/indexen.css'
>>> import urllib.parse
>>> urllib.parse.urljoin(u1, u2)
'https://example.com/en/pub-assets/css/indexen.css'

Python 2.7.14:

>>> u1 = 'https://example.com/en/index.html'
>>> u2 = 'pub-assets/css/indexen.css'
>>> import urlparse
>>> urlparse.urljoin(u1, u2)
'https://example.com/en/pub-assets/css/indexen.css'

Note the changed import. I would double-check your Python version, import statement, and perhaps post more of your program.
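
Note also that the url2 in the question starts with a slash, while the one in the examples above does not. The sketch below (Python 3) shows how that changes what urljoin returns for the paths from the question. The helper resolve_under_root and the idea that 'build' is the site root are my assumptions for illustration; they are not in the original post.

```python
import posixpath
from urllib.parse import urljoin

page = 'build/en/index.html'

# A leading slash makes the reference root-relative, so the base path is discarded.
root_relative = urljoin(page, '/pub-assets/css/indexen.css')
print(root_relative)    # /pub-assets/css/indexen.css

# Without the leading slash, the reference resolves against the page's directory.
page_relative = urljoin(page, 'pub-assets/css/indexen.css')
print(page_relative)    # build/en/pub-assets/css/indexen.css

# '..' climbs out of 'en/' first, which gives the result the question expects.
parent_relative = urljoin(page, '../pub-assets/css/indexen.css')
print(parent_relative)  # build/pub-assets/css/indexen.css


def resolve_under_root(site_root, ref):
    """Hypothetical helper: map a root-relative reference into site_root.

    Assumes site_root (here 'build') is the directory served as '/'.
    """
    return posixpath.join(site_root, ref.lstrip('/'))


under_build = resolve_under_root('build', '/pub-assets/css/indexen.css')
print(under_build)      # build/pub-assets/css/indexen.css
```

If 'build' really is the directory served as the site root, the helper is probably closer to the intent than urljoin, since urljoin has no way of knowing where the root lives on disk.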
