
regex challenge for htaccess 301 redirect

I have a number of similar Tag and Category URLs that I would like to redirect in one clean regex statement (if possible). Example URLs are:

<table style="width:100%">
  <tr> <th>URL</th> <th>Redirect To</th> </tr>
  <tr> <td>/category/categoryname-2/</td> <td>/blog/category/categoryname/</td> </tr>
  <tr> <td>/category/categoryname-2/page/3/</td> <td>/blog/category/categoryname/</td> </tr>
  <tr> <td>/category/categoryname/</td> <td>/blog/category/categoryname/</td> </tr>
  <tr> <td>/category/categoryname/page/3/</td> <td>/blog/category/categoryname/</td> </tr>
  <tr> <td>/tag/tagname-2/</td> <td>/blog/tag/tagname/</td> </tr>
  <tr> <td>/tag/tagname-2/page/3/</td> <td>/blog/tag/tagname/</td> </tr>
  <tr> <td>/tag/tagname/</td> <td>/blog/tag/tagname/</td> </tr>
  <tr> <td>/tag/tagname/page/3/</td> <td>/blog/tag/tagname/</td> </tr>
  <tr> <td>/blog/page/5/</td> <td>/blog/</td> </tr>
</table>

Notice that some of the tag and category names have "-2" at the end of the name, which I would like removed.

I have had a good attempt at doing this but am not getting very far; the -2 piece is stumping me unfortunately, hence turning to the expertise here on SO.

I think I've got it. I think this covers all category and tag redirects. I've explained what I think it's doing, but I could be mistaken and may have reached the solution as much by accident as by design.

Also it's hard to fully test due to browser caching and CDN caching, but seems to work when incognito:

RedirectMatch 301 ^/(category|tag)/(([a-z]+)|([a-z]+-[a-z]+))(?:-2)?/(?:page/.*)?$ /blog/$1/$2/
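Because caching makes the live redirect hard to check, the pattern can also be exercised offline. The following is a minimal sketch in Python, not part of the Apache setup: the `redirect` helper and the `CASES` table are names made up for illustration, and it assumes Python's `re` module treats these particular constructs (groups, `?`, `|`, character classes) the same way Apache's regex engine does. The expected targets are the category and tag rows from the table in the question.

```python
import re

# Pattern copied verbatim from the RedirectMatch rule.
PATTERN = re.compile(
    r"^/(category|tag)/(([a-z]+)|([a-z]+-[a-z]+))(?:-2)?/(?:page/.*)?$"
)


def redirect(path):
    """Return the redirect target for path, or None if the rule does not match.

    Hypothetical helper: mimics the "/blog/$1/$2/" substitution.
    """
    m = PATTERN.match(path)
    if m is None:
        return None
    return "/blog/{}/{}/".format(m.group(1), m.group(2))


# Category and tag rows from the question's table.
CASES = {
    "/category/categoryname-2/": "/blog/category/categoryname/",
    "/category/categoryname-2/page/3/": "/blog/category/categoryname/",
    "/category/categoryname/": "/blog/category/categoryname/",
    "/category/categoryname/page/3/": "/blog/category/categoryname/",
    "/tag/tagname-2/": "/blog/tag/tagname/",
    "/tag/tagname-2/page/3/": "/blog/tag/tagname/",
    "/tag/tagname/": "/blog/tag/tagname/",
    "/tag/tagname/page/3/": "/blog/tag/tagname/",
}

for source, expected in CASES.items():
    actual = redirect(source)
    status = "ok" if actual == expected else "MISMATCH"
    print(status, source, "->", actual)
```

The last row of the table, /blog/page/5/, is not matched by this pattern (it does not start with /category/ or /tag/), so it would need a rule of its own.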

(category|tag) matches the word category or tag. This returns $1 (group 1).

(([a-z]+)|([a-z]+-[a-z]+))(?:-2)?

Some categories have a - in the name; therefore I have to handle the -2 (which we want rid of) and genuine - (which we want to keep).

(([a-z]+)|([a-z]+-[a-z]+)) - This returns group $2. The | in the middle says to match either what is before the | or what is after it.

([a-z]+) - This finds the first unbroken block of lowercase letters, i.e. it stops when it hits a number or punctuation (a - in this case).

([a-z]+-[a-z]+) - This finds an unbroken string of letters, then a hyphen, then another unbroken string of letters. This accounts for tag names with a hyphen. If there isn't a text block after the hyphen, this alternative does not match, which is what we want to happen.
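To see how the two alternatives and the optional -2 interact, here is a small Python sketch that isolates just the name portion of the rule. The `NAME` pattern and `clean_name` helper are illustrative names, and the `^` and `$` anchors are added here only so that a single path segment can be tested on its own; in this isolated pattern the outer group is group 1, whereas in the full rule it is $2.

```python
import re

# Name portion of the rule only, anchored to a single path segment.
NAME = re.compile(r"^(([a-z]+)|([a-z]+-[a-z]+))(?:-2)?$")


def clean_name(segment):
    """Return the name the rule would capture, or None if the segment is rejected."""
    m = NAME.match(segment)
    return m.group(1) if m else None


print(clean_name("tagname"))      # tagname
print(clean_name("tagname-2"))    # tagname
print(clean_name("my-tag"))       # my-tag
print(clean_name("my-tag-2"))     # my-tag
print(clean_name("tagname-"))     # None (no text block after the hyphen)
print(clean_name("my-long-tag"))  # None (more than one hyphen)
```

The last case suggests that a name with two or more hyphens would not be redirected by the rule as written, so it is worth checking whether any such tag or category names exist on the site.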

(?:-2)? This says there might be a -2. The question mark at the end makes it optional: it is fine if the -2 is not there. The ?: inside the brackets makes the group non-capturing, so a -2 that is present is matched but is not included in $2.

(?:page/.*)? Looks for the word "page" followed by anything and, if found, matches it in a non-capturing group so that it is dropped from the redirect target. The question mark at the end means this part does not need to be there.
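As a quick illustration of what the (?:...) groups do, the sketch below matches one of the example URLs in Python and inspects the numbered groups. It assumes Python's `re` numbers groups the same way Apache does for this pattern; `RULE` is just a local name for the compiled pattern.

```python
import re

# Pattern copied verbatim from the RedirectMatch rule.
RULE = re.compile(
    r"^/(category|tag)/(([a-z]+)|([a-z]+-[a-z]+))(?:-2)?/(?:page/.*)?$"
)

m = RULE.match("/tag/tagname-2/page/3/")

# Only the four plain (...) groups are numbered; the (?:...) groups are not.
print(RULE.groups)  # 4

print(m.group(0))   # /tag/tagname-2/page/3/  (the whole match, including -2 and page/3/)
print(m.group(1))   # tag
print(m.group(2))   # tagname  (the -2 was matched but not captured)
```

The -2 and the page/3/ part are still consumed by the match, which is why the rule fires for those URLs, but neither is available to the /blog/$1/$2/ target.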
