Python Regular Expression Findall

Question

I'm trying to find up to the top level domain information.

If I were to search " https://testwebsite.com.au/folders/viewforum.php?f=1556n " I only want my expression to find " https://testwebsite.com.au "

I'm using the following expression:

urlRegex = re.compile(r'''( (https?|sftp|ftp|file)://[-a-zA-Z0-9+&@#/%?
           =~_|!:,.;'"*$()]*[a-zA-Z0-9+&@#/%=~_|]   )''', re.VERBOSE)

Answer 1

If you want to be strict and correct, use a real URL parser. If you're looking for something quick and dirty that will work for 99% of the URLs you'll find, how about:

urlRegex = re.compile(r'([a-zA-Z]+://[^/\s]+)')

Python Regular Expression Findall

Question

1 answers

solution1
0 2015-09-19 22:01:18

Python Regular Expression Findall

Question

1 answers

solution1 0 2015-09-19 22:01:18

solution1
0 2015-09-19 22:01:18