Split a string of emoji characters using the split function with a regex
I want to use the split function of JavaScript to split a string of emoji characters. There are many questions like this on Stack Overflow, but I cannot find a complete solution. So I tried it my own way:
a) Use the split function with a regex.
b) Split emoji characters by matching UTF-16 surrogate pairs: a code unit from \uD800 to \uDBFF followed by one from \uDC00 to \uDFFF.
c) In this regex, exclude the zero-width joiner (\u200D) and variation selector (\uFE0F) characters.
So I wrote the following:
var p = '👦🏼👧🏼👩🏼👧🏾👧🏿👩👩👧👧👭👫👨❤️💋👨';
and split it:
var split = p.split(/(?![\u200D\uFE0F])([\uD800-\uDBFF][\uDC00-\uDFFF])/);
But the result is wrong :(
["", "👦", "", "🏼", "", "👧", "", "🏼", "", "👩", "", "🏼", "", "👧", "", "🏾", "", "👧", "", "🏿", "", "👩", "", "👩", "", "👧", "", "👧", "", "👭", "", "👫", "", "👨", "❤️", "💋", "", "👨", ""]
Did I use the excluding pattern (?![\u200D\uFE0F]) correctly? If so, is the error caused by my approach? The expected result is: ["👦🏼", "👧🏼", "👩🏼", "👧🏾", "👧🏿", "👩👩👧👧", "👭", "👫", "👨❤️💋👨"]
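The shape of the output above is consistent with how split treats a capturing group: every captured separator is spliced into the result, and the text between two adjacent separators is an empty string. The following is a minimal sketch on a shorter input, with the emoji written as \u escapes (an assumption made so the example does not depend on the editor). It also suggests that the lookahead cannot exclude anything at that position, since the next code unit must be a high surrogate, which is never \u200D or \uFE0F.

```javascript
// U+1F466 BOY, U+1F3FC skin-tone modifier, U+1F467 GIRL, each as a surrogate pair.
var s = '\uD83D\uDC66\uD83C\uDFFC\uD83D\uDC67';

// Same pattern as in the question. The capturing group makes split keep
// every matched pair, and the gaps between adjacent pairs are empty strings.
var parts = s.split(/(?![\u200D\uFE0F])([\uD800-\uDBFF][\uDC00-\uDFFF])/);
console.log(parts); // ["", "👦", "", "🏼", "", "👧", ""]

// Dropping the empty strings still leaves one element per surrogate pair,
// so the skin-tone modifier stays separated from the emoji it modifies.
var nonEmpty = parts.filter(Boolean);
console.log(nonEmpty.length); // 3
```

So the pattern splits per surrogate pair rather than per visible emoji; grouping modifiers and joined sequences needs a pattern that matches the whole sequence at once.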
===
Update: I solved this problem for my site, https://www.emojionline.org. You can test it there. I use a dictionary that holds all emojis, and I use the replace function to wrap every emoji as |emoji|. Then I can split the emoji string on the | symbol. That works well :)
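The dictionary approach described above could look roughly like the sketch below. The five-entry dictionary, the helper names (escapeRegExp, splitEmoji) and the longest-first ordering are assumptions for illustration, not the code used on the site.

```javascript
// Hypothetical mini-dictionary; a real one would list every emoji sequence to recognise.
var dictionary = [
  '\uD83D\uDC66\uD83C\uDFFC', // boy + medium-light skin tone
  '\uD83D\uDC69\u200D\uD83D\uDC69\u200D\uD83D\uDC67\u200D\uD83D\uDC67', // family joined by ZWJ
  '\uD83D\uDC69', // woman
  '\uD83D\uDC67', // girl
  '\uD83D\uDC6D'  // two women holding hands
];

// Longest entries first, so the family sequence wins over the single woman it starts with.
var sorted = dictionary.slice().sort(function (a, b) { return b.length - a.length; });

// Escape regex metacharacters (relevant for keycap emoji such as "*" or "#").
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var dictRx = new RegExp(sorted.map(escapeRegExp).join('|'), 'g');

function splitEmoji(text) {
  // Wrap each dictionary hit in "|", split on "|", and drop the empty pieces.
  return text.replace(dictRx, '|$&|').split('|').filter(Boolean);
}

var sample = dictionary[0] + dictionary[1] + dictionary[4];
console.log(splitEmoji(sample).length); // 3
```

One limitation of this sketch: input that already contains a literal | is split at that character too, so a delimiter that cannot occur in the text would be a safer choice.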
I extended the emoji-regex by Mathias Bynens a bit with a [\uD800-\uDBFF][\uDC00-\uDFFF](?:[\u200D\uFE0F][\uD800-\uDBFF][\uDC00-\uDFFF]){2,} alternative. It matches a common surrogate-pair emoji (two UTF-16 code units) followed by 2 or more sequences (this can be controlled with the {2,} limiting quantifier) of either a zero-width joiner or a variation selector and then another surrogate-pair emoji.
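To see what the added alternative does on its own, here is a small sketch. It assumes the character classes are the UTF-16 surrogate ranges and the joiner/selector pair named in the description; the variable names are illustrative.

```javascript
// Assumed reading of the alternative: a surrogate pair, then two or more
// repetitions of (ZWJ or VS16) followed by another surrogate pair.
var sequenceRx = /[\uD800-\uDBFF][\uDC00-\uDFFF](?:[\u200D\uFE0F][\uD800-\uDBFF][\uDC00-\uDFFF]){2,}/;

var ZWJ = '\u200D';
var woman = '\uD83D\uDC69';
var girl = '\uD83D\uDC67';

// woman ZWJ woman ZWJ girl ZWJ girl: three joined repetitions after the first pair.
var family = [woman, woman, girl, girl].join(ZWJ);

// woman ZWJ girl: only one joined repetition, so {2,} is not satisfied.
var pair = [woman, girl].join(ZWJ);

var familyMatch = family.match(sequenceRx);
var pairMatch = pair.match(sequenceRx);

console.log(familyMatch !== null && familyMatch[0] === family); // true
console.log(pairMatch); // null
```

Shorter joined sequences are therefore left to the other alternatives of the pattern; lowering the quantifier to {1,} would make this alternative cover them as well.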
Without the alternative, the results are [ '👦🏼','👧🏼','👩🏼','👧🏾','👧🏿','👩👩👧','👧','👭','👫','👨❤️💋👨' ].
var p = 'my family 👦🏼👧🏼👩🏼👧🏾👧🏿👩👩👧👧👭👫👨❤️💋👨 here'; var rx = /([\?-\?][\?-\?](?:[\\️][\?-\?][\?-\?]){2,}|\?\?(?:\(?:(?:\?\?\)?\?\?|(?:\?\?\)?\?\?)|\?[\?-\?])|\?\?\(?:\?\?\)?\?\?\\?\?|\?\?\(?:\?\?\)?\?\?\(?:\?[\?\?])|\?\?\️\\?\?|(?:\?[\?\?\?]|\?[\?\?\?\?\?\?\?\?\?-\?\?\?\?\?\?-\?]|\?[\?\?-\?\?\?\?-\?])(?:\?[\?-\?])\[\♀\♂]\️|\?\?(?:\?[\?-\?])\(?:\?[\?\?\?\?\?\?\?]|\?[\?\?\?\?\?\?])|(?:\?[\?\?\?]|\?[\?\?\?\?\?\?\?\?\?\?-\?\?\?\?\?\?-\?]|\?[\?\?-\?\?-\?\?-\?])\[\♀\♂]\️|\?\?\?\?|\?\?\?\?|\?\?\?\?|\?\?(?:\?[\?\?\?\?\?\?\?])|\?\?(?:\?[\?\?\?\?\?])|\?\?(?:\?[\?\?\?\?-\?\?-\?\?\?-\?])|(?:\⛹|\?[\?\?]|\?\?)(?:\️\[\♀\♂]|(?:\?[\?-\?])\[\♀\♂])\️|(?:\?\?\️\\?\?|\?\?(?:\?[\?-\?])\[\⚕\⚖\✈]|\?\?\[\⚕\⚖\✈]|\?\?(?:(?:\?[\?-\?])\[\⚕\⚖\✈]|\[\⚕\⚖\✈]))\️|\?\?(?:\?[\?\?-\?\?-\?])|\?\?\(?:\?[\?\?\?\?\?\?\?]|\?[\?\?\?\?\?\?]|\❤\️\(?:\?\?\(?:\?[\?\?])|\?[\?\?]))|\?\?(?:\?[\?-\?\?\?\?-\?\?])|\?\?(?:\?[\?\?\?\?])|\?\?(?:\?[\?\?\?\?\?\?])|\?\?(?:\?[\?-\?\?\?\?])|[#\\*0-9]\️\⃣|\?\?(?:\?[\?\?\?-\?\?-\?\?-\?\?\?\?\?])|\?\?(?:\?[\?-\?\?\?\?\?\?-\?\?\?\?])|\?\?(?:\?[\?\?\?])|\?\?(?:\?[\?\?-\?\?-\?\?-\?\?\?])|\?\?(?:\?[\?\?\?\?\?\?\?])|\?\?(?:\?[\?\?\?-\?\?\?\?\?\?\?\?])|\?\?\?\?\?\?(?:\?\?\?\?\?\?|\?\?\?\?\?\?|\?\?\?\?\?\?)\?\?|\?\?(?:\(?:\❤\️\(?:\?\?\)?\?\?|(?:(?:\?[\?\?])\)?\?\?\\?\?|(?:(?:\?[\?\?])\)?\?\?\(?:\?[\?\?])|\?[\?\?\?\?\?\?\?]|\?[\?\?\?\?\?\?])|(?:\?[\?-\?])\(?:\?[\?\?\?\?\?\?\?]|\?[\?\?\?\?\?\?]))|\?\?(?:\?[\?-\?\?-\?\?-\?\?\?-\?])|\?\?(?:\?[\?\?-\?\?\?\?\?\?\?\?])|\?\?(?:\?[\?\?])|\?\?(?:\?[\?-\?\?-\?\?-\?])|\?\?(?:\?[\?\?\?\?-\?\?-\?\?\?\?\?\?])|\?\?(?:\?[\?\?\?-\?\?-\?\?-\?\?\?])|\?\?(?:\?[\?\?\?\?\?\?\?])|\?\?(?:\?[\?\?\?\?\?\?-\?])|\?\?(?:\?[\?\?])|(?:\⛹|\?[\?\?]|\?\?)(?:\?[\?-\?])|(?:\?[\?\?\?]|\?[\?\?\?\?\?\?\?\?\?-\?\?\?\?\?\?-\?]|\?[\?\?-\?\?\?\?-\?])(?:\?[\?-\?])|(?:[\☝\✊-\✍]|\?[\?\?\?]|\?[\?\?\?-\?\?\?\?\?\?-\?\?\?\?\?\?\?\?\?\?\?\?\?\?\?]|\?[\?-\?\?\?\?-\?\?-\?])(?:\?[\?-\?])|\?\?(?:\(?:(?:(?:\?[\?\?])\)?\?\?|(?:(?:\?[\?\?])\)?\?\?)|\?[\?-\?])|(?:[\☝\⛹\✊-\✍]|\?[\?\?-\?\?\?-\?]|\?[
\?\?\?-\?\?-\?\?\?-\?\?\?-\?\?-\?\?\?\?\?\?\?\?\?-\?\?-\?\?\?-\?\?\?]|\?[\?-\?\?\?\?\?-\?\?\?\?-\?])(?:\?[\?-\?])?|(?:[\⌚\⌛\⏩-\⏬\⏰\⏳\◽\◾\☔\☕\♈-\♓\♿\⚓\⚡\⚪\⚫\⚽\⚾\⛄\⛅\⛎\⛔\⛪\⛲\⛳\⛵\⛺\⛽\✅\✊\✋\✨\❌\❎\❓-\❕\❗\➕-\➗\➰\➿\⬛\⬜\⭐\⭕]|\?[\?\?\?\?-\?\?-\?\?\?\?\?-\?\?-\?\?\?\?-\?\?-\?\?-\?\?-\?\?-\?\?-\?\?-\?\?\?-\?]|\?[\?-\?\?\?-\?\?-\?\?-\?\?-\?\?\?\?\?\?-\?\?-\?\?\?-\?\?\?\?-\?]|\?[\?-\?\?-\?\?-\?\?-\?\?-\?\?-\?\?\?-\?])|(?:[#\\*0-9\\xA9\\xAE\‼\⁉\™\ℹ\↔-\↙\↩\↪\⌚\⌛\⌨\⏏\⏩-\⏳\⏸-\⏺\Ⓜ\▪\▫\▶\◀\◻-\◾\☀-\☄\☎\☑\☔\☕\☘\☝\☠\☢\☣\☦\☪\☮\☯\☸-\☺\♀\♂\♈-\♓\♠\♣\♥\♦\♨\♻\♿\⚒-\⚗\⚙\⚛\⚜\⚠\⚡\⚪\⚫\⚰\⚱\⚽\⚾\⛄\⛅\⛈\⛎\⛏\⛑\⛓\⛔\⛩\⛪\⛰-\⛵\⛷-\⛺\⛽\✂\✅\✈-\✍\✏\✒\✔\✖\✝\✡\✨\✳\✴\❄\❇\❌\❎\❓-\❕\❗\❣\❤\➕-\➗\➡\➰\➿\⤴\⤵\⬅-\⬇\⬛\⬜\⭐\⭕\〰\〽\㊗\㊙]|\?[\?\?\?\?\?\?\?\?-\?\?-\?\?\?\?\?\?-\?\?\?\?-\?\?-\?\?\?\?-\?\?-\?\?-\?\?-\?]|\?[\?-\?\?-\?\?-\?\?-\?\?\?\?-\?\?\?-\?\?\?\?\?\?\?\?\?\?\?-\?\?-\?\?-\?\?\?\?\?\?\?-\?\?-\?\?-\?\?-\?\?\?\?\?\?-\?]|\?[\?-\?\?-\?\?-\?\?-\?\?-\?\?-\?\?\?-\?])\️)/; var res = p.split(rx).filter(Boolean); document.body.innerHTML = res;
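The split-with-capture plus filter(Boolean) technique used above can be tried with a much smaller pattern. The sketch below is a simplified stand-in, not the extended emoji-regex: it assumes an emoji is a surrogate pair (or a symbol in \u2600-\u27BF with an optional \uFE0F), optionally followed by a skin-tone modifier, with further such units attached by \u200D. Flags and keycap sequences are not handled.

```javascript
// Simplified stand-in pattern (an assumption for illustration only).
// unit     = surrogate pair | symbol in U+2600..U+27BF with optional VS16
// modifier = optional skin tone U+1F3FB..U+1F3FF
// emoji    = unit modifier (ZWJ unit modifier)*
var simpleRx = /((?:[\uD800-\uDBFF][\uDC00-\uDFFF]|[\u2600-\u27BF]\uFE0F?)(?:\uD83C[\uDFFB-\uDFFF])?(?:\u200D(?:[\uD800-\uDBFF][\uDC00-\uDFFF]|[\u2600-\u27BF]\uFE0F?)(?:\uD83C[\uDFFB-\uDFFF])?)*)/;

var ZWJ = '\u200D';
var boyTone = '\uD83D\uDC66\uD83C\uDFFC';
var girlTone = '\uD83D\uDC67\uD83C\uDFFC';
var family = ['\uD83D\uDC69', '\uD83D\uDC69', '\uD83D\uDC67', '\uD83D\uDC67'].join(ZWJ);
var twoWomen = '\uD83D\uDC6D';
// man ZWJ heart+VS16 ZWJ kiss mark ZWJ man
var kiss = ['\uD83D\uDC68', '\u2764\uFE0F', '\uD83D\uDC8B', '\uD83D\uDC68'].join(ZWJ);

var text = 'my family ' + boyTone + girlTone + family + twoWomen + kiss + ' here';

// The outer capturing group keeps each matched emoji in the output;
// filter(Boolean) removes the empty strings between adjacent matches.
var pieces = text.split(simpleRx).filter(Boolean);
console.log(pieces.length); // 7
```

The surrounding text ('my family ' and ' here') is kept as its own elements, which matches the behaviour of the split call above; filtering those out would need an extra test against the pattern.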