
How to migrate to UTF-8 for a multi-customer shared codebase in PHP

In our company we use a proprietary CMS. It runs almost a thousand websites and is approximately 15 years old (it has evolved over time and is very feature-rich).

Until now we have always used ISO-8859-1 as the charset, but we now need to use UTF-8 for one project.

Here are my questions:

  1. Do you think this approach is sound: maintaining a single SVN version, automatically converting it to UTF-8, and search/replacing the problematic PHP functions with wrappers that pick the right implementation?
  2. Have you done this kind of migration before, and what do you consider risky about it?

TL;DR background:

  • The core of our CMS is centralised (SVN) and deployed (rsync) to a specific path on each of our servers; this path is in the include path of every website.
  • The databases are different for each project (but the core tables have the same structure).
  • Each website uses a document_root holding the website's specific files (media, JS, site-specific PHP code).

With this setup we cannot migrate every website at once (for example, because of the site-specific local code). So I want to maintain two versions of our core: one in ISO-8859-1 and the other in UTF-8. My current plan is to add a script to our deployment system that creates a UTF-8 encoded copy of the core before the rsync step.
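A minimal sketch of such a pre-deployment conversion step, assuming the mbstring extension is available on the build machine. The function name `convert_tree_to_utf8` and the list of text extensions are hypothetical, not part of the CMS; files that are already valid UTF-8 (including pure ASCII) are copied unchanged so the script can be re-run safely.

```php
<?php
// Hypothetical helper: build a UTF-8 copy of an ISO-8859-1 source tree.
// Returns the number of files that were actually re-encoded.
function convert_tree_to_utf8(
    string $srcDir,
    string $dstDir,
    array $textExtensions = ['php', 'inc', 'html', 'tpl', 'js', 'css']
): int {
    $srcDir = rtrim($srcDir, '/\\');
    $dstDir = rtrim($dstDir, '/\\');
    if (!is_dir($dstDir)) {
        mkdir($dstDir, 0777, true);
    }

    $converted = 0;
    // SELF_FIRST visits each directory before the files it contains.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($srcDir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST
    );

    foreach ($iterator as $item) {
        $relative = substr($item->getPathname(), strlen($srcDir) + 1);
        $target = $dstDir . DIRECTORY_SEPARATOR . $relative;

        if ($item->isDir()) {
            if (!is_dir($target)) {
                mkdir($target, 0777, true);
            }
            continue;
        }

        $isText = in_array(strtolower($item->getExtension()), $textExtensions, true);
        if (!$isText) {
            // Binary assets (images, archives, ...) are copied byte for byte.
            copy($item->getPathname(), $target);
            continue;
        }

        $content = file_get_contents($item->getPathname());
        if (mb_check_encoding($content, 'UTF-8')) {
            // Pure ASCII or already UTF-8: nothing to convert.
            file_put_contents($target, $content);
            continue;
        }

        file_put_contents($target, mb_convert_encoding($content, 'UTF-8', 'ISO-8859-1'));
        $converted++;
    }

    return $converted;
}
```

One caveat with this heuristic: a few ISO-8859-1 byte sequences happen to be valid UTF-8 as well, so a file containing only such sequences would be left unconverted. Charset declarations embedded in the sources (for example in HTML meta tags) are not rewritten by this sketch.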

My concern is, for example, all the "mb_" functions in PHP that won't be called. I will have to search/replace every native PHP string function with a custom one that uses the "mb_" version when necessary. (Furthermore, overloading those functions via mbstring.func_overload must be configured in php.ini; it cannot be set in the .htaccess of a particular website (source).)
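The custom wrappers could look roughly like the sketch below. The constant `CMS_CHARSET` and the `cms_*` function names are hypothetical; the idea is only that each deployment defines its charset once and the wrappers dispatch to the byte-oriented or the multibyte function accordingly. Explicit wrappers also avoid relying on mbstring.func_overload, which was deprecated in PHP 7.2 and removed in PHP 8.0.

```php
<?php
// Hypothetical: each deployment defines its charset once, e.g. in its config file.
if (!defined('CMS_CHARSET')) {
    define('CMS_CHARSET', 'UTF-8');
}

// True when this deployment works with UTF-8 strings.
function cms_is_utf8(): bool
{
    return strtoupper(CMS_CHARSET) === 'UTF-8';
}

// Length in characters (not bytes) on UTF-8 deployments.
function cms_strlen(string $s): int
{
    return cms_is_utf8() ? mb_strlen($s, 'UTF-8') : strlen($s);
}

// Substring by character offsets on UTF-8 deployments.
function cms_substr(string $s, int $start, ?int $length = null): string
{
    if (cms_is_utf8()) {
        return mb_substr($s, $start, $length, 'UTF-8');
    }
    // Older PHP versions treat an explicit null length as 0, so omit it.
    return $length === null ? substr($s, $start) : substr($s, $start, $length);
}

// Position in characters on UTF-8 deployments; false when not found.
function cms_strpos(string $haystack, string $needle, int $offset = 0)
{
    return cms_is_utf8()
        ? mb_strpos($haystack, $needle, $offset, 'UTF-8')
        : strpos($haystack, $needle, $offset);
}
```

With wrappers like these, the search/replace over the core only has to happen once, and the same source works on both kinds of server.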

PS: sorry for my poor English, it's not my native language :(

So here is how I handled it:

  • Each server is either ISO-8859-1 or UTF-8 and has a dedicated configuration (e.g. mbstring.func_overload).
  • Before each deployment, a script creates a copy of the ISO version and converts it to UTF-8.
  • Each server receives either the UTF-8 or the ISO-8859-1 source code.

For each individual CMS instance, a conversion tool exists. It converts the filesystem, the database charset/collation, the configuration (e.g. connecting to MySQL with utf8), and so on.
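The actual conversion tool is not shown here. As a rough sketch of its database part, the statements it issues could be generated like this. The function name `build_charset_migration_sql` is hypothetical, and the `utf8mb4` / `utf8mb4_unicode_ci` defaults are an assumption (MySQL's plain `utf8` charset stores at most three bytes per character).

```php
<?php
// Hypothetical helper: build the MySQL statements that switch one site's
// database and tables to a new charset/collation. It only builds strings;
// executing them (after a backup) is left to the caller.
function build_charset_migration_sql(
    string $database,
    array $tables,
    string $charset = 'utf8mb4',
    string $collation = 'utf8mb4_unicode_ci'
): array {
    // Charset and collation names are plain identifiers; reject anything else.
    foreach ([$charset, $collation] as $name) {
        if (!preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            throw new InvalidArgumentException("Invalid charset or collation: $name");
        }
    }

    // Quote identifiers with backticks, doubling any embedded backtick.
    $quote = static function (string $identifier): string {
        return '`' . str_replace('`', '``', $identifier) . '`';
    };

    // Default charset for tables created later.
    $statements = [
        sprintf('ALTER DATABASE %s CHARACTER SET %s COLLATE %s', $quote($database), $charset, $collation),
    ];

    // CONVERT TO re-encodes the existing column data as well.
    foreach ($tables as $table) {
        $statements[] = sprintf(
            'ALTER TABLE %s CONVERT TO CHARACTER SET %s COLLATE %s',
            $quote($table),
            $charset,
            $collation
        );
    }

    return $statements;
}
```

The connection itself also has to use the new charset, for example through the `charset` option of a PDO MySQL DSN or `mysqli::set_charset()`.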

For incompatible functions such as utf8_encode, I search/replaced them with a wrapper function that skips the call when the server is UTF-8.
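Such a wrapper might be sketched as follows. The names `cms_utf8_encode`, `cms_utf8_decode` and `CMS_CHARSET` are hypothetical, and this version uses `mb_convert_encoding()` instead of calling `utf8_encode()` / `utf8_decode()` directly, since those two are deprecated as of PHP 8.2.

```php
<?php
// Hypothetical: the charset of this deployment, defined once per server.
if (!defined('CMS_CHARSET')) {
    define('CMS_CHARSET', 'UTF-8');
}

// Replacement for utf8_encode(): internal string -> UTF-8.
// On a UTF-8 deployment the string is already UTF-8, so it is returned as is.
function cms_utf8_encode(string $s, string $charset = CMS_CHARSET): string
{
    if (strtoupper($charset) === 'UTF-8') {
        return $s;
    }
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

// Replacement for utf8_decode(): UTF-8 input -> internal string.
// On a UTF-8 deployment no conversion is needed either.
function cms_utf8_decode(string $s, string $charset = CMS_CHARSET): string
{
    if (strtoupper($charset) === 'UTF-8') {
        return $s;
    }
    return mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
}
```

The optional `$charset` parameter is there only to make the behaviour easy to exercise for both server types; in the CMS itself the default would be used.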

