Encoding error when trying to replace Cyrillic letters

Question

I have a problem with my string. After the for loop, I get some other symbols instead of my exact Cyrillic letters. The goal is to change the Cyrillic letters ąčęėįšųūž into a1, c2, e1, e2, i1, s2, u1, u2, z2. I have come up with this:

$ltSymbolsArray = array(
    'a1' => 'ą',
    'c2' => 'č',
    'e1' => 'ę',
    'e2' => 'ė',
    'i1' => 'į',
    's2' => 'š',
    'u1' => 'ų',
    'u2' => 'ū',
    'z2' => 'ž'
);
$string = 'ąsąžadcę';

for ($i = 0; $i < strlen($string); $i++) {
    foreach ($ltSymbolsArray as $key => $value) {
        if ($string[$i] == $value) {
            $string[$i] = $key;
        }
    }
}

It looks like a simple solution, but I can't handle the encoding. Encoding is a mystery to me, so I would really appreciate any help with this problem.

Answer 1

You can't simply iterate over a Unicode string byte by byte and expect each iteration to yield a full character when a single character occupies more than one byte.
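To see why the loop in the question misbehaves, here is a small sketch. It assumes the script file is saved as UTF-8, in which each accented letter from the question takes two bytes. The variable names are only illustrative.

```php
<?php
// Assumption: this source file is saved as UTF-8.
$string = 'ąsąžadcę';

// strlen() counts bytes, not characters: 5 two-byte letters + 3 ASCII letters.
$byteCount = strlen($string);

// With the "u" modifier, "." matches one whole UTF-8 character at a time,
// so the number of matches is the number of characters.
$charCount = preg_match_all('/./u', $string, $matches);

// $string[0] is only the first byte of 'ą', so comparing it to the
// two-byte string 'ą' can never succeed.
$firstByte = bin2hex($string[0]);
$fullChar  = bin2hex('ą');

echo "bytes: $byteCount, characters: $charCount\n";
echo "first byte: $firstByte, full character: $fullChar\n";
```

Assigning a multi-character key such as 'a1' to $string[$i] also overwrites just one byte, which is most likely where the stray symbols come from.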

Use preg_split in combination with the Unicode modifier (u) to split your string into valid Unicode characters. Then use the result to replace the characters in the original string.
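A minimal sketch of that approach, assuming both the script and the input are UTF-8. The function name replaceLtSymbols is made up for this example, and the map is the question's array flipped so that the accented letter is the key. Rather than editing the original string in place, it builds a new one, which sidesteps the byte-offset problem.

```php
<?php
// Assumption: this file and the input string are UTF-8 encoded.
// The question's array, flipped: accented letter => replacement.
$ltSymbols = array(
    'ą' => 'a1', 'č' => 'c2', 'ę' => 'e1', 'ė' => 'e2', 'į' => 'i1',
    'š' => 's2', 'ų' => 'u1', 'ū' => 'u2', 'ž' => 'z2',
);

// Hypothetical helper, not part of PHP itself.
function replaceLtSymbols($string, array $map)
{
    // An empty pattern with the "u" modifier splits between every
    // character; PREG_SPLIT_NO_EMPTY drops the empty pieces at the edges.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // preg_split() fails when the input is not valid UTF-8.
        return $string;
    }

    $result = '';
    foreach ($chars as $char) {
        // Swap in the replacement if there is one, otherwise keep the character.
        $result .= isset($map[$char]) ? $map[$char] : $char;
    }
    return $result;
}

echo replaceLtSymbols('ąsąžadcę', $ltSymbols), "\n"; // prints a1sa1z2adce1
```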

You could also use one of the multibyte regex functions, such as mb_ereg_replace.
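The multibyte variant might look roughly like this. It assumes the mbstring extension is installed with regex support and that the script and input are UTF-8. The function name replaceLtSymbolsMb is hypothetical. Each letter is used directly as a pattern, which is safe here only because none of them is a regex metacharacter.

```php
<?php
// Assumptions: ext-mbstring with regex support is loaded; everything is UTF-8.
mb_regex_encoding('UTF-8');

// The question's array, flipped: accented letter => replacement.
$ltSymbols = array(
    'ą' => 'a1', 'č' => 'c2', 'ę' => 'e1', 'ė' => 'e2', 'į' => 'i1',
    'š' => 's2', 'ų' => 'u1', 'ū' => 'u2', 'ž' => 'z2',
);

// Hypothetical helper, not part of PHP itself.
function replaceLtSymbolsMb($string, array $map)
{
    foreach ($map as $char => $replacement) {
        // mb_ereg_replace() works on whole characters in the regex encoding.
        $replaced = mb_ereg_replace($char, $replacement, $string);
        if (is_string($replaced)) {
            $string = $replaced;
        }
        // On failure (for example invalid input) the string is left as it was.
    }
    return $string;
}

echo replaceLtSymbolsMb('ąsąžadcę', $ltSymbols), "\n"; // prints a1sa1z2adce1
```

Because every replacement is plain ASCII, running the replacements one after another cannot accidentally re-replace earlier output.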

Answer 1
0 Accepted 2013-11-29 23:59:40
