Regex - Parse string by removing characters

Question

I am trying to parse the path part of a URL.

The input is a string such as site/whatever% ^&*/page/to-days_date// which I would like to convert into site/whatever/page/to-days_date

Things to remove would be anything that is not one of the following:

lower or upper case letter
digit / number
dash
underscore

Answer 1

Add /+$ to your existing regex as an extra alternative, separated by a pipe ( | ). It matches one or more / characters at the end of the input, so it handles /, // or ///// at the end.

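import re

myString = '''blog/whatever%  ^&*/page/to-days_date//'''
# The leading /+$ alternative is the new part; the character class removes every other disallowed character.
print(re.sub(r'/+$|[^a-zA-Z0-9_\-\/]+', '', myString))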
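For repeated use, the same pattern could be compiled once and wrapped in a small function. This is only a sketch: the name clean_path and the sample inputs are made up for illustration, and the pattern is the one from this answer, unchanged.

```python
import re

# Same pattern as in the answer: trailing slashes OR any run of
# characters that is not a letter, digit, underscore, dash or slash.
_CLEANUP = re.compile(r'/+$|[^a-zA-Z0-9_\-/]+')

def clean_path(path):
    """Hypothetical helper: drop disallowed characters and trailing slashes."""
    return _CLEANUP.sub('', path)

# One, two or many trailing slashes are all stripped.
for raw in ('site/page/', 'site/page//', 'site/page/////'):
    print(clean_path(raw))  # each prints: site/page
```

Note that slashes inside the path are left alone, so this does not collapse repeated slashes in the middle of the string.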

Answer 1 (accepted, 2014-04-05 14:35:34)