
How to convert Emojis to their respective HTML code entities in PHP 5.3?

I need to convert the Emojis (e.g. 😀) in strings to their respective HTML code entities (e.g. &#128512;) on a PHP 5.3 site.

I need to do this so that user input is stored properly in a legacy script's MySQL database and displays properly when shown back to the user. When I attempt to save Emojis directly from user input, they are incorrectly saved as ? in the database. This legacy script does not support utf8mb4 in MySQL (this solution failed), and all attempts at converting its database, tables, and columns to utf8mb4 have not solved the problem. The only solution I have left, which I have already confirmed works, is converting user-inputted Emojis in strings to their respective HTML code entities and storing those entities as-is in the database. They then display correctly as Emojis when retrieved, since modern browsers automatically render those entities as Emoji characters.

I have also tried this solution, but it does not work in PHP 5.3, only in 5.4 and above. (I cannot upgrade to 5.4 on this particular site because the legacy script it depends on only works in 5.3 and cannot be changed or upgraded under any circumstances.)

I have also tried this solution, which works in PHP 5.3, but you can't feed it a string, only a specific Emoji, so it does not solve my problem.

I only need the Emojis in a string converted, nothing else. (However, if that is not possible, then I suppose I can live with other HTML entities being converted along with them, like & to &amp;, but I would prefer that not be the case.)

So how can I convert Emojis in strings to their respective HTML code entities in PHP 5.3, such that a string like this & that 😎 gets converted to this & that &#128526;?

The code to detect the emoji exceeds stackoverflow's character limit, so here's a gist instead:

https://gist.github.com/BarryMode/432a7a1f9621e824c8a3a23084a50f60#file-htmlemoji-php

The entire function is essentially just

preg_replace_callback(pattern, callback, string);

The string is the input containing the emoji that you want to change into html entities. The pattern uses regex to find the emoji in the string, and each match is then fed into the callback, which is where the conversion from emoji to html entity happens.
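To make the pattern/callback/string shape concrete, here is a minimal sketch that should stay within PHP 5.3 syntax. It is not the gist's code: the function name `htmlemoji_sketch` and the deliberately coarse two-range pattern are assumptions made for illustration, and it assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical sketch, not the gist's htmlemoji(): replace each matched
// character with its decimal numeric HTML entity.
function htmlemoji_sketch($string)
{
    // Assumption: a coarse stand-in pattern covering U+2600..U+27BF and
    // U+1F000..U+1FAFF. The gist's real pattern is far more precise.
    $pattern = '/[\x{2600}-\x{27BF}\x{1F000}-\x{1FAFF}]/u';

    // Anonymous functions are available from PHP 5.3 onward.
    $callback = function ($matches) {
        // UTF-32BE stores the code point as one big-endian 32-bit integer.
        $utf32 = mb_convert_encoding($matches[0], 'UTF-32BE', 'UTF-8');
        // No unpack(...)[1] here: function array dereferencing needs PHP 5.4.
        $parts = unpack('N', $utf32);
        return '&#' . $parts[1] . ';';
    };

    return preg_replace_callback($pattern, $callback, $string);
}
```

Calling `htmlemoji_sketch('this & that 😎')` should return `this & that &#128526;`. The `&` is left alone because it falls outside both ranges in the pattern.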

In creating this function, htmlemoji(), I combined a few different pieces of code that others had worked on. Here are some credits:

The callback uses this stackoverflow answer to build each entity.

The pattern was directly ripped from this source on GitHub.
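For the callback step, the entity number is simply the character's Unicode code point written in decimal. If mbstring is not available, the code point can be decoded from the UTF-8 bytes by hand. The sketch below is an assumption-laden illustration, not the credited answer's code: the helper names are hypothetical, and it assumes its input is a single, valid UTF-8 encoded character.

```php
<?php
// Hypothetical helper: decode ONE valid UTF-8 character to its code point.
// No validation is performed; malformed input gives meaningless results.
function utf8_codepoint_sketch($char)
{
    $lead = ord($char[0]);
    if ($lead < 0x80) {
        return $lead;              // 1-byte sequence (ASCII)
    }
    if (($lead & 0xE0) === 0xC0) {
        $length = 2;               // lead byte 110xxxxx
        $codePoint = $lead & 0x1F;
    } elseif (($lead & 0xF0) === 0xE0) {
        $length = 3;               // lead byte 1110xxxx
        $codePoint = $lead & 0x0F;
    } else {
        $length = 4;               // lead byte 11110xxx (most emoji)
        $codePoint = $lead & 0x07;
    }
    // Each continuation byte (10xxxxxx) contributes its low 6 bits.
    for ($i = 1; $i < $length; $i++) {
        $codePoint = ($codePoint << 6) | (ord($char[$i]) & 0x3F);
    }
    return $codePoint;
}

// Hypothetical wrapper: build the decimal numeric entity for one character.
function utf8_entity_sketch($char)
{
    return '&#' . utf8_codepoint_sketch($char) . ';';
}
```

For 😎, whose UTF-8 bytes are F0 9F 98 8E, this works out to code point 0x1F60E, so `utf8_entity_sketch('😎')` should give `&#128526;`.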

I have created a trait for this, which is a mix of the two ideas below. It also covers Emojis the others miss, such as 🤩.

How to convert Emojis to their respective HTML code entities in PHP 5.3

Idea taken from https://gist.github.com/BarryMode/432a7a1f9621e824c8a3a23084a50f60#file-htmlemoji-php and https://github.com/chefkoch-dev/morphoji

A mix of the 2 ideas above.

trait ConvertEmojis {

/** @var string */
protected static $emojiPattern;

public function convert($str) {

    // Objects are passed by handle, so no reference (&) to $this is needed.
    return preg_replace_callback($this->getEmojiPattern(), array($this, 'entity'), $str);
}

protected function entity($matches) {
    return '&#'.hexdec(bin2hex(mb_convert_encoding("$matches[0]", 'UTF-32', 'UTF-8'))).';';
}

/**
 * Returns a regular expression pattern to detect emoji characters.
 *
 * @return string
 */
protected function getEmojiPattern()
{
    if (null === self::$emojiPattern) {
        $codeString = '';

        foreach ($this->getEmojiCodeList() as $code) {
            if (is_array($code)) {

                $first = dechex(array_shift($code));
                $last  = dechex(array_pop($code));
                $codeString .= '\x{' . $first . '}-\x{' . $last . '}';

            } else {
                $codeString .= '\x{' . dechex($code) . '}';
            }
        }

        self::$emojiPattern = "/[$codeString]/u";
    }

    return self::$emojiPattern;
}

/**
 * Returns an array with all unicode values for emoji characters.
 *
 * This is a function so the array can be defined with a mix of hex values
 * and range() calls to conveniently maintain the array with information
 * from the official Unicode tables (where values are given in hex as well).
 *
 * With PHP > 5.6 this could be done in a class variable, maybe even a
 * constant.
 *
 * @return array
 */
protected function getEmojiCodeList()
{
    return [
        // Various 'older' characters, dingbats etc. which over time have
        // received an additional emoji representation.
        0x203c,
        0x2049,
        0x2122,
        0x2139,
        range(0x2194, 0x2199),
        range(0x21a9, 0x21aa),
        range(0x231a, 0x231b),
        0x2328,
        range(0x23ce, 0x23cf),
        range(0x23e9, 0x23f3),
        range(0x23f8, 0x23fa),
        0x24c2,
        range(0x25aa, 0x25ab),
        0x25b6,
        0x25c0,
        range(0x25fb, 0x25fe),
        range(0x2600, 0x2604),
        0x260e,
        0x2611,
        range(0x2614, 0x2615),
        0x2618,
        0x261d,
        0x2620,
        range(0x2622, 0x2623),
        0x2626,
        0x262a,
        range(0x262e, 0x262f),
        range(0x2638, 0x263a),
        0x2640,
        0x2642,
        range(0x2648, 0x2653),
        0x2660,
        0x2663,
        range(0x2665, 0x2666),
        0x2668,
        0x267b,
        0x267f,
        range(0x2692, 0x2697),
        0x2699,
        range(0x269b, 0x269c),
        range(0x26a0, 0x26a1),
        range(0x26aa, 0x26ab),
        range(0x26b0, 0x26b1),
        range(0x26bd, 0x26be),
        range(0x26c4, 0x26c5),
        0x26c8,
        range(0x26ce, 0x26cf),
        0x26d1,
        range(0x26d3, 0x26d4),
        range(0x26e9, 0x26ea),
        range(0x26f0, 0x26f5),
        range(0x26f7, 0x26fa),
        0x26fd,
        0x2702,
        0x2705,
        range(0x2708, 0x270d),
        0x270f,
        0x2712,
        0x2714,
        0x2716,
        0x271d,
        0x2721,
        0x2728,
        range(0x2733, 0x2734),
        0x2744,
        0x2747,
        0x274c,
        0x274e,
        range(0x2753, 0x2755),
        0x2757,
        range(0x2763, 0x2764),
        range(0x2795, 0x2797),
        0x27a1,
        0x27b0,
        0x27bf,
        range(0x2934, 0x2935),
        range(0x2b05, 0x2b07),
        range(0x2b1b, 0x2b1c),
        0x2b50,
        0x2b55,
        0x3030,
        0x303d,
        0x3297,
        0x3299,

        // Modifier for emoji sequences.
        0x200d,
        0x20e3,
        0xfe0f,

        // 'Regular' emoji unicode space, containing the bulk of them.
        range(0x1f000, 0x1f9cf)
    ];
}    

}
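Note that the `trait` keyword and the short array syntax (`[...]`) both require PHP 5.4 or later, so the code above will not parse on PHP 5.3 as written. As a rough sketch of how the same logic might be carried to 5.3, the version below uses plain functions, a plain class, and `array()`. All names ending in `_sketch` or `Sketch` are hypothetical, the code list is trimmed to a handful of entries for brevity (a real list would carry every entry from the trait), and mbstring is assumed to be loaded.

```php
<?php
// Hypothetical, heavily trimmed code list in PHP 5.3 array() syntax.
function emoji_code_list_sketch()
{
    return array(
        0x203c,
        range(0x2194, 0x2199),
        0x200d,
        0xfe0f,
        range(0x1f000, 0x1f9cf),
    );
}

// Build a character-class pattern from single code points and ranges.
function emoji_pattern_sketch()
{
    $codeString = '';
    foreach (emoji_code_list_sketch() as $code) {
        if (is_array($code)) {
            // A range() result: only its first and last values are needed.
            $codeString .= '\x{' . dechex(reset($code)) . '}-\x{' . dechex(end($code)) . '}';
        } else {
            $codeString .= '\x{' . dechex($code) . '}';
        }
    }
    return '/[' . $codeString . ']/u';
}

// Plain class standing in for the trait; methods are public so the
// array callback is callable from preg_replace_callback().
class EmojiConverterSketch
{
    public function convert($str)
    {
        return preg_replace_callback(emoji_pattern_sketch(), array($this, 'entity'), $str);
    }

    public function entity($matches)
    {
        $parts = unpack('N', mb_convert_encoding($matches[0], 'UTF-32BE', 'UTF-8'));
        return '&#' . $parts[1] . ';';
    }
}
```

With this in place, `$converter = new EmojiConverterSketch(); echo $converter->convert('wow 🤩');` should print `wow &#129321;`.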
