
Is there a Python function to replace absolute URLs with relative URL paths?

I'm crawling a website and saving selected pages for offline browsing. Of course, the links between the saved pages are broken, so I'd like to do something about that. Is there a reasonably simple way in Python to relativize URL paths like this without reinventing the wheel:

/folder1/folder2/somepage.html    becomes---->   folder2/somepage.html
/folder1/otherpage.html           becomes---->   ../otherpage.html

I understand the function would need two URL paths to determine the relative path, since the linked resource's path is relative to the page it appears on.

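The newer pathlib (only briefly mentioned in the duplicate) can also handle URL-like strings, provided one URL is a prefix of the other. Use PurePosixPath rather than Path so the separators stay as forward slashes on Windows:

from pathlib import PurePosixPath

# pathlib treats the URL as a plain path and collapses "//" internally;
# relative_to() only strips the shared leading part.
abs_p = PurePosixPath('https://docs.python.org/3/library/pathlib.html')
rel_p = abs_p.relative_to('https://docs.python.org/3/')
print(rel_p)  # library/pathlib.html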
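
For the `../` case in the question, stripping a common prefix is not enough, because the target may sit above the directory of the page that links to it. A minimal sketch using only the standard library's `posixpath.relpath` and `urllib.parse.urlsplit` might look like the following. The helper name `relativize` is made up for illustration, and the sketch assumes both URLs are on the same host and that query strings and fragments can be dropped:

```python
import posixpath
from urllib.parse import urlsplit


def relativize(target_url, page_url):
    """Return the path of target_url relative to the directory of page_url.

    Hypothetical helper: assumes both URLs share the same host and
    ignores any query string or fragment.
    """
    # Fall back to "/" so an empty path (e.g. "https://example.com") still works.
    target_path = urlsplit(target_url).path or '/'
    page_dir = posixpath.dirname(urlsplit(page_url).path) or '/'
    # posixpath always uses forward slashes, regardless of the operating system.
    return posixpath.relpath(target_path, start=page_dir)


# The two cases from the question, each relative to the page holding the link.
print(relativize('/folder1/folder2/somepage.html', '/folder1/index.html'))
# folder2/somepage.html
print(relativize('/folder1/otherpage.html', '/folder1/folder2/somepage.html'))
# ../otherpage.html
```

The second argument is the page that contains the link, not a directory; the helper takes its parent directory as the starting point, which matches how browsers resolve relative links.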
