
How can I remove special emoji from text using XQuery?

I have $text = "Hello üäö$".

I want to remove just the emoji from the text using XQuery. How can I do that?

Expected result: "Hello üäö$"

I tried to use:

replace($text, '\p{IsEmoticons}+', '')

but it didn't work.

It only removed the smileys.

Result now: "Hello üäö$"
Expected result: "Hello üäö$"

Thanks in advance :)

I outlined the approach in my answer to the original question, which I updated based on your comment asking how to strip such characters out.

Quoting from that expanded answer:

The "Emoticons" block doesn't contain all characters commonly associated with "emoji." For example, 💜 (Purple Heart, U+1F49C), according to a site like https://www.compart.com/en/unicode/U+1F49C that lets you look up Unicode character information, is from:

Miscellaneous Symbols and Pictographs, U+1F300 - U+1F5FF

This block is not available in XPath or XQuery processors, since it is listed neither in the XML Schema 1.0 spec linked above nor in "Unicode block names for use in XSD regular expressions", a list of blocks that XPath and XQuery processors conforming to XML Schema 1.1 are required to support.

For characters from blocks not available in XPath or XQuery, you can manually construct character classes. For example, given the purple heart character above, we can match it as follows:

 replace("Purple 💜 heart", "[🌀-🗿]", "")

This returns the expected result:

 Purple  heart
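If the emoji do not survive copy and paste or your editor's encoding, the same character class can be written with numeric character references. This is only a sketch of an equivalent spelling, assuming your processor expands character references in string literals as standard XQuery does:

```xquery
(: Same range as above, written with character references:
   &#x1F300; is the first and &#x1F5FF; the last code point of the
   Miscellaneous Symbols and Pictographs block; &#x1F49C; is Purple Heart. :)
replace("Purple &#x1F49C; heart", "[&#x1F300;-&#x1F5FF;]", "")
```

The references are resolved when the query is parsed, so the regular expression engine sees exactly the same characters as in the literal version.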

This approach can be applied to any other emoji, or to any other character:

  1. Locate the character's Unicode block.
  2. Craft your regular expression with the block name (if available in XPath) or a character class.
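
The two steps above could be sketched roughly as follows. This is an illustration rather than a complete emoji filter: it assumes your processor supports \p{IsEmoticons} (as the attempt in the question suggests), and the sample string and variable names are made up for the example.

```xquery
(: Step 1: get the code point so the block can be looked up.
   U+1F49C (Purple Heart) is 128156 in decimal, which falls inside
   U+1F300 - U+1F5FF (127744 - 128511). :)
let $cp := string-to-codepoints("&#x1F49C;")

(: Step 2: one character class that mixes a named block with a manual range.
   \p{IsEmoticons} is assumed to be supported by the processor;
   the explicit range covers Miscellaneous Symbols and Pictographs. :)
let $emoji := "[\p{IsEmoticons}&#x1F300;-&#x1F5FF;]"

return (
  $cp,
  replace("Purple &#x1F49C; heart &#x1F600;", $emoji, "")
)
```

The first item returned is 128156; the second is the sample string with both symbols removed. Characters from further blocks can be handled by adding more ranges to the same class.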
