
Remove multiple query string parameters from htaccess

I have some website URLs that I want .htaccess to redirect after stripping a few query string parameters, for example:

http://www.mywebsite.com/archive?s=200&dis=default&opt=foo
http://www.mywebsite.com/archive?dis=foo&opt=baz

or

http://www.mywebsite.com/archive?type=default&format=rss
http://www.mywebsite.com/archive?pg=3&format=rss&type=default

I want to keep all the parameters except type, format, dis and opt, which are causing a 404 error. I've found a way to remove a single parameter, but I still can't find a regex (or any other approach) that removes multiple query parameters.

This is my code so far:

RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*)&?view=[^&]+&?(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)&?opt=[^&]+&?(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)&?type=[^&]+&?(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)&?format=[^&]+&?(.*)$
RewriteRule ^/?(.*)$ /$1?%1%2 [R=301,L]

This doesn't work: it removes only a single parameter and keeps the others, which still cause errors.

PS: As you can see, it should apply only to the 'archive' page, but that's not a problem :)

UPDATE

This is a URL that I'm testing at the moment:

http://www.mywebsite.com/archive?foo=0&force=0&format=feed&type=rss

Which I want to be like this:

http://www.mywebsite.com/archive?foo=0&force=0

RE-UPDATED

By using collapsar's answer, the server's error_log shows this:

Invalid command '<If', perhaps misspelled or defined by a module not included in the server configuration

Discussion

Unfortunately, the query string portion of URLs is excluded from rewriting by default: the RewriteRule directive does not match against the query string. Any change to the query string has to be written explicitly into the substitution string.
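
To illustrate the point, here is a minimal sketch. It assumes the rules live in an .htaccess file in the document root, and the parameter names foo and bar are hypothetical. The RewriteRule pattern sees only the path, so the query string is inspected in a RewriteCond and written back in the substitution:

```apache
RewriteEngine On

# The RewriteRule pattern is matched against the path only ("archive" here),
# never against the query string.
# %{QUERY_STRING} holds the query string without the leading "?".
RewriteCond %{QUERY_STRING} ^foo=([^&]*)$
# %1 is the group captured by the RewriteCond above. The "?" in the
# substitution replaces the original query string with "bar=%1".
# 302 is used here so browsers do not cache the redirect while testing.
RewriteRule ^archive$ /archive?bar=%1 [R=302,L]
```

With these rules, a request for /archive?foo=0 is redirected to /archive?bar=0. The redirected request no longer matches the condition, so it is not rewritten again.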

This implies that the rewriting cannot be accomplished without resorting to RewriteCond directives (FWIW, that is why the previous versions of this answer were wrong).

RewriteRule performs the actual rewriting as soon as any one of the RewriteCond patterns linked with the OR flag matches. The remaining conditions in the OR chain are then skipped, so the set of conditions is not tested exhaustively and only one parameter is captured per pass.
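
A reduced version of the OP's rule set shows the effect. This is only a sketch: it assumes an .htaccess file in the document root and uses just two of the parameters.

```apache
RewriteEngine On

# Request: /archive?format=feed&type=rss
# Both conditions could match, but evaluation of an OR chain stops at the
# first success, so %1 and %2 are taken from the "format" condition only.
RewriteCond %{QUERY_STRING} ^(.*)&?format=[^&]+&?(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)&?type=[^&]+&?(.*)$
# The redirect target therefore still contains "type=rss".
RewriteRule ^archive$ /archive?%1%2 [R=302,L]
```

Only format is removed by this pass. The type parameter is dealt with only when the redirected request runs through the rules again.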

Solution

Adjust your rule set as follows:

# %{QUERY_STRING} never includes the leading "?", so a parameter in first
# position is matched with ^ rather than with a separator.
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)(^|[&?])format=[^&]+&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3

RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)(^|[&?])opt=[^&]+&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3

RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)(^|[&?])type=[^&]+&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3

RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)(^|[&?])view=[^&]+&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3

# Marker absent: redirect once. QSA appends the sanitized query string.
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} !(^|&)breaktheloop=1(&|$)
RewriteRule ^ %{REQUEST_URI}?breaktheloop=1 [QSA,R=301,L]

# Marker present: strip it internally and stop.
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)(^|&)breaktheloop=1&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3 [L]

The RewriteCond patterns allow for an improperly escaped URL appearing as the value of some parameter in the query string. If you are not concerned about this, drop the non-greedy matching modifier (i.e. use ^(.*) instead of ^(.*?)).

Synopsis

The differences from the OP's original attempt are:

  • individual substitution of each offending parameter
  • including the parameter separator in the substitution pattern
  • catering for improperly escaped URLs as a parameter value
  • a separate rewriting (copy) rule that triggers the redirection after any number of substitutions have been applied. The QSA flag is necessary to keep the sanitized query string.
  • a breaktheloop parameter to, well, break the redirection loop.
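
As a possible variation (not part of the rule set above, and untested), the marker parameter could be replaced by an environment variable set with the E flag, so that the redirect fires only when something was actually removed. The variable name QS_CLEANED is hypothetical, and the sketch assumes an .htaccess file in the document root with /archive as the exact path prefix:

```apache
RewriteEngine On

# QS_CLEANED is a hypothetical variable name used only in this sketch.
# Each rule removes one parameter internally and records that it fired.
# %1 is everything before the parameter, %2 everything after its value.
RewriteCond %{REQUEST_URI} ^/archive
RewriteCond %{QUERY_STRING} ^(.*?)(?:^|&)format=[^&]*(.*)$
RewriteRule ^ %{REQUEST_URI}?%1%2 [E=QS_CLEANED:1]

RewriteCond %{REQUEST_URI} ^/archive
RewriteCond %{QUERY_STRING} ^(.*?)(?:^|&)type=[^&]*(.*)$
RewriteRule ^ %{REQUEST_URI}?%1%2 [E=QS_CLEANED:1]

# (repeat the block above for opt and view)

# Redirect once, and only if at least one rule above fired. The substitution
# contains no "?", so the current, already sanitized query string is kept.
RewriteCond %{ENV:QS_CLEANED} =1
RewriteRule ^ %{REQUEST_URI} [R=302,L]
```

For the test URL from the question, the two patterns shown should leave foo=0&force=0 as the query string. When the removed parameter is the first one, a leading & remains, which is harmless. Switch R=302 to R=301 once the behaviour has been confirmed.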

Documentation

The mod_rewrite section of the Apache httpd directive documentation is the place to find more detailed information.
