Keep redirecting old realurl urls after migrating to TYPO3 9+

I would like to use realurl's memory of expired URLs to generate 301 redirects on sites upgraded to TYPO3 9+, and so avoid 404 errors.

For example, before TYPO3 9, fetching /my-old-page redirected to /my-new-page, because /my-old-page was still in the realurl database table. Since the migration to TYPO3 9, fetching /my-old-page throws a 404.

TYPO3 9 ships an upgrade wizard which transforms realurl pagepaths/aliases into slugs, but it does not transform realurl's expired pagepaths/aliases into sys_redirect records.

What would be the best strategy to keep the realurl memory of redirects:

  • Migrate all expired URLs/aliases to sys_redirect? This can lead to a big sys_redirect table, with performance issues
  • Run a middleware after the RedirectHandler that searches for expired url and triggers a 301 if found? This will make an extra db query for each request.
  • Create a PageNotFoundHandler which searches for expired url if page is not found? TYPO3 allows only one ErrorHandler per status code so it can be an issue
  • List the redirects in the .htaccess

By "best strategy" I mean:

  • performance matters (I have more than 10,000 expired URLs)
  • if possible the redirects should be maintainable by an editor (like sys_redirect)

Thanks for your insights!

My second solution (which I am using - slightly modified - in production) is with TYPO3:

  • Create a page error handler for 404 based on PageErrorHandlerInterface. Look up the requested URL in the realurl table. If you have a hit, redirect to the new URL.
  • If there is no hit, fall back to what you would usually do, e.g. display the error page.

This has the following advantages (over the TYPO3 redirects extension):

  • It is only fired up on a 404, not on every page request.
  • You don't have to migrate your redirects to sys_redirect; you can use the old realurl table as is.

Repository\PathMappingRepository:

  // Needs: use TYPO3\CMS\Core\Database\ConnectionPool; use TYPO3\CMS\Core\Utility\GeneralUtility;

  /**
   * Returns the uid of the page realurl knew under $path, or 0 if there is no entry.
   */
  public function findPageidForPathFromRealurl(string $path, int $languageId) : int
  {
        // look the path up without its leading slash
        $path = ltrim($path, '/');

        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)->getQueryBuilderForTable('tx_realurl_pathdata');
        $uid = $queryBuilder->select('tx_realurl_pathdata.page_id')
            ->from('tx_realurl_pathdata')
            // only return pages that still exist
            ->join(
                'tx_realurl_pathdata',
                'pages',
                'p',
                $queryBuilder->expr()->eq('tx_realurl_pathdata.page_id', $queryBuilder->quoteIdentifier('p.uid'))
            )
            ->where(
                // escape % and _ so that LIKE matches the path literally
                $queryBuilder->expr()->like('tx_realurl_pathdata.pagepath', $queryBuilder->createNamedParameter($queryBuilder->escapeLikeWildcards($path))),
                $queryBuilder->expr()->eq('tx_realurl_pathdata.language_id', $queryBuilder->createNamedParameter($languageId, \PDO::PARAM_INT)),
                $queryBuilder->expr()->eq('p.sys_language_uid', $queryBuilder->createNamedParameter($languageId, \PDO::PARAM_INT))
            )
            // newest entry first, and only that one is needed
            ->orderBy('tx_realurl_pathdata.uid', 'DESC')
            ->setMaxResults(1)
            ->execute()
            ->fetchColumn(0);
        $this->logger->debug("findPageidForPathFromRealurl: path=$path language=$languageId returns $uid");
        // fetchColumn() returns false when nothing matched, which casts to 0
        return (int)$uid;
  }
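
The flow around this repository method (look up the path on a 404, redirect on a hit, otherwise fall back) can be sketched as a small function. This is only a sketch under assumptions: resolveLegacyRedirect and the $buildUrl callback are hypothetical names, not TYPO3 API. In a real setup the function would be called from handlePageError() of the class implementing PageErrorHandlerInterface, $findPageId would be the repository method above, and $buildUrl would wrap whatever you use to generate the URL of a page.

```php
<?php
declare(strict_types=1);

/**
 * Decide what to do with a request that ended in a 404.
 *
 * @param string   $requestUri URI that was not found, e.g. "/my-old-page?x=1"
 * @param int      $languageId language to look the path up in
 * @param callable $findPageId function(string $path, int $languageId): int, 0 if unknown
 * @param callable $buildUrl   function(int $pageId, int $languageId): string (hypothetical)
 * @return string|null         redirect target, or null to show the usual error page
 */
function resolveLegacyRedirect(string $requestUri, int $languageId, callable $findPageId, callable $buildUrl): ?string
{
    // Only the path is looked up; the query string is ignored.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path) || trim($path, '/') === '') {
        return null; // malformed URI or site root: nothing to look up
    }
    $pageId = (int)$findPageId($path, $languageId);
    if ($pageId <= 0) {
        return null; // no hit in the realurl table: fall back to the error page
    }
    $target = (string)$buildUrl($pageId, $languageId);
    // Never redirect a URL to itself, that would loop.
    return $target === $path ? null : $target;
}

// Usage with stand-in callbacks instead of the database and the TYPO3 router:
$lookup = function (string $path, int $languageId): int {
    return ltrim($path, '/') === 'my-old-page' ? 42 : 0;
};
$urlFor = function (int $pageId, int $languageId): string {
    return '/my-new-page';
};

var_dump(resolveLegacyRedirect('/my-old-page?utm=1', 0, $lookup, $urlFor)); // string(12) "/my-new-page"
var_dump(resolveLegacyRedirect('/unknown', 0, $lookup, $urlFor));           // NULL
```

When the function returns null, the handler would render the normal 404 page, so the extra query only costs something on requests that would have failed anyway.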

My first solution handles the redirects in the web server instead. For the following, I am assuming you use the Apache web server and have access to its configuration, for example under /etc/apache2.

I don't have any numbers, but I assume redirects handled in the web server are more efficient than firing up PHP and TYPO3. The disadvantage is that the redirects are also evaluated for static assets (unless those are handled elsewhere, e.g. by a CDN). Also, they cannot be maintained by editors. But if you are migrating from realurl, for example, you can use the Apache solution temporarily and take it down after a while.

However, this can get unmaintainable and quite ugly if you have a lot of redirects.

Sites I have seen had often accumulated redirects over the years, happily mixing RewriteRule, Redirect and RedirectMatch, with RewriteCond thrown in for good measure. To keep that nice and clean I have two suggestions (both of which have been used on sites I maintained):

  1. Maintain the redirects in the configuration management system (e.g. Ansible, SaltStack). Do not write the redirect statements there; just add the URLs and let your states (or whatever the CM calls them) write the statements for you.

  2. Use RewriteMap and a file consisting of the URLs.

For both solutions, you usually have redirects of (at least) 2 types:

  • exact redirects, eg you want to redirect /abc/def to /new/def, but not for example /abc/def/subpage
  • regex or wildcard redirect, eg you want to redirect /abc/* to /new/*

Both can be handled with appropriate RewriteRule statements, but they look different. For solutions 1 and 2 you need to handle the two types separately.
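
For solution 1, "let the CM write the statements" boils down to a small generator that turns a list of URLs into directives. The following is a minimal sketch of that idea in PHP, not a template for any particular CM. The function name renderRewriteRules and the convention that a source ending in "/*" marks a wildcard redirect are assumptions made for this example.

```php
<?php
declare(strict_types=1);

/**
 * Render one RewriteRule line per redirect.
 * A source ending in "/*" is treated as a wildcard redirect, anything else as exact.
 *
 * @param array<string,string> $redirects old path => new path
 */
function renderRewriteRules(array $redirects, int $status = 307): string
{
    $lines = [];
    foreach ($redirects as $from => $to) {
        $from = ltrim((string)$from, '/');
        if (substr($from, -2) === '/*') {
            // wildcard: capture everything after the prefix and append it to the target
            $pattern = '^/?' . preg_quote(substr($from, 0, -1)) . '(.*)$';
            $base    = substr($to, -2) === '/*' ? substr($to, 0, -2) : rtrim($to, '/');
            $target  = $base . '/$1';
        } else {
            // exact: anchor both ends so that subpages are not matched
            $pattern = '^/?' . preg_quote($from) . '$';
            $target  = $to;
        }
        $lines[] = sprintf('RewriteRule "%s" "%s" [R=%d,L]', $pattern, $target, $status);
    }
    return implode("\n", $lines) . "\n";
}

echo renderRewriteRules([
    '/abc/def' => '/new/def', // exact
    '/abc/*'   => '/new/*',   // wildcard
]);
// RewriteRule "^/?abc/def$" "/new/def" [R=307,L]
// RewriteRule "^/?abc/(.*)$" "/new/$1" [R=307,L]
```

Because of the L flag the first matching rule wins, so exact redirects should be listed before wildcard redirects that cover the same prefix.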

Example 1 (regex redirect):

RewriteRule ^/?abc/(.*)$ /new/$1 [R=307,L]

Example 2 (RewriteMap):

/etc/apache2/sites-available/mysite.conf

RewriteEngine on
RewriteMap exactredirects "txt:/etc/apache2/redirects/exactredirects.txt"
# only redirect if the requested path has an entry in the map
RewriteCond "${exactredirects:$1}" !=""
RewriteRule "^(.*)$" "${exactredirects:$1}" [R=307,L]

/etc/apache2/redirects/exactredirects.txt:

/abc.txt /def.txt
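
With thousands of expired URLs you would not write that file by hand. As a rough sketch, assuming you already have the old and new paths as pairs (for example exported from the realurl table), the file content could be generated like this. The function name renderRewriteMap is made up for this example; the output path is the one from the RewriteMap directive.

```php
<?php
declare(strict_types=1);

/**
 * Render the content of a "txt:" RewriteMap file: one "key value" pair per line.
 *
 * @param iterable $pairs rows of [old path, new URL]
 */
function renderRewriteMap(iterable $pairs): string
{
    $map = [];
    foreach ($pairs as [$old, $new]) {
        // keys carry a leading slash, like the path the RewriteRule captures in the vhost
        $old = '/' . ltrim($old, '/');
        // txt maps are whitespace separated, so such paths cannot be written as-is;
        // a redirect onto itself would loop
        if (preg_match('/\s/', $old . $new) === 1 || $old === $new) {
            continue;
        }
        $map[$old] = $new; // a later row for the same old path wins
    }
    ksort($map, SORT_STRING);

    $out = "# generated file, do not edit by hand\n";
    foreach ($map as $old => $new) {
        $out .= $old . ' ' . $new . "\n";
    }
    return $out;
}

echo renderRewriteMap([
    ['/my-old-page', '/my-new-page'],
    ['abc.txt', '/def.txt'],
]);
// # generated file, do not edit by hand
// /abc.txt /def.txt
// /my-old-page /my-new-page
```

The result can be written with file_put_contents() to /etc/apache2/redirects/exactredirects.txt and kept in version control next to the Apache configuration.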

Recommendations:

  • Put the Apache configuration and the redirect files in version control.
  • Be careful with 301 (permanent). Permanent redirect means permanent. As the redirect is cached by the client, there is no way for you to undo it. Use 301 only if you are sure.
  • You often see recommendations to use .htaccess. You can use this instead of putting the redirects in the Apache config. But if you have full control of the Apache configuration, you don't need .htaccess, and the Apache documentation recommends not using .htaccess at all unless you need it. There is a big disadvantage (apart from performance considerations): if you make a mistake in .htaccess, you can take your server down. If you make the change in the Apache configuration, you can do a service apache2 reload (which aborts on error) or an apachectl configtest. (Or, even better, your CM does this for you before the states are executed.)
  • About using RewriteRule vs. Redirect: you can do a lot with either one (or with variants such as RedirectMatch). RewriteRule is generally more powerful, while Redirect may be faster. Ideally, use one or the other rather than mixing them. See also "When not to use mod_rewrite" in the Apache documentation.
