
Converting ASCII value in the middle of a string

I have a PHP script that stores values from a web store in a MySQL database. The store allows customers to leave a message, which can wreak havoc when they use emojis. To keep these characters from breaking my script, I apply FILTER_SANITIZE_STRING with FILTER_FLAG_STRIP_HIGH to all my strings before sending them to MySQL.

This works well, except that when I display the data again in a Java program I've written, I get things like "I&#39;m" instead of "I'm".

Is there a way to have Java find the ASCII values and convert them back into characters?

My current plan of attack is to write a function that takes each relevant string column, examines each word looking for &#, finds the position of the semicolon after the &#, replaces that value with the corresponding ASCII character, and returns the new string.
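The plan above could be sketched roughly as follows, using only the Java standard library. The class and method names (NumericEntityDecoder, decode) are made up for illustration, and the sketch assumes that only decimal references of the form &#NN; appear in the data; named entities such as &amp;amp; and hexadecimal references are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class NumericEntityDecoder {
    // Matches decimal numeric character references such as &#39; or &#38;
    private static final Pattern NUMERIC_ENTITY = Pattern.compile("&#(\\d{1,7});");

    /** Replaces each decimal reference (&#NN;) with the character it encodes. */
    static String decode(String input) {
        Matcher matcher = NUMERIC_ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            // Copy the text between the previous match and this one unchanged.
            out.append(input, last, matcher.start());
            int codePoint = Integer.parseInt(matcher.group(1));
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                // Not a valid Unicode code point: keep the reference as-is.
                out.append(matcher.group());
            }
            last = matcher.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("I&#39;m here &#38; ready"));
    }
}
```

A regular expression replaces the manual search for &# and the following semicolon, so text without a well-formed reference passes through untouched.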

It's doable, but I'm hoping there is an existing way to do this without reinventing the wheel.

Edit: Thank you to @rzwitserloot for pointing me in the right direction. For anyone who sees this and does not read my comment on his answer: I ended up using JSoup. Here is a snippet of the relevant final code on the Java side, for anyone else working through this:

// Connect method opens a connection to the MySQL server 
connect();
// Query the MySQL server 
resultSet = statement.executeQuery("select * from order_tracking order by DateOrdered");

// If there is any result, iterate through them until the end is reached. 
while (resultSet.next()) { 
  // Add each returned row into the list to send to the table
  Jsoup.parse(resultSet.getString(2)).text()
.
.
.
}

The .text() at the end of Jsoup.parse(String) strips the HTML structure (i.e. <head>, <body>, etc.) that Jsoup automatically adds and returns only the text portion, with &#38; (or whatever ASCII value it might be) properly decoded.

Thanks!

The best solution is to fix the problem at its source. Databases, MySQL included, can of course store emojis, but MySQL is weird: its utf8 character set isn't really UTF-8, it's misnamed. It stores at most three bytes per character, while most emojis need four. The real UTF-8 in MySQL is called utf8mb4. Use that encoding and you can store smileys just fine.
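To illustrate why the three-byte limit matters, here is a small sketch that measures how many bytes a character takes in UTF-8. The class and method names (Utf8Width, utf8Length, needsUtf8mb4) are hypothetical, and the check only looks at code points outside the Basic Multilingual Plane, which are the ones that need four bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class Utf8Width {
    /** Number of bytes the UTF-8 encoding of the given text occupies. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if any code point lies outside the BMP, i.e. needs four bytes in UTF-8. */
    static boolean needsUtf8mb4(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String smiley = new String(Character.toChars(0x1F600)); // U+1F600 GRINNING FACE
        System.out.println(utf8Length("\u00e9"));   // e with acute accent: 2 bytes
        System.out.println(utf8Length(smiley));     // emoji: 4 bytes
        System.out.println(needsUtf8mb4(smiley));   // true
        System.out.println(needsUtf8mb4("I'm"));    // false
    }
}
```

A check like needsUtf8mb4 could be used to spot messages that a column declared with MySQL's utf8 would not be able to hold.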

If that option somehow doesn't work for you: your strings are HTML-encoded, so the solution is to HTML-decode them. Java doesn't ship with an HTML decoder out of the box, so you need a dependency. One example is StringEscapeUtils.unescapeHtml4(String) from Apache Commons Lang: http://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/StringEscapeUtils.html#unescapeHtml4(java.lang.String)

You have HTML-escaped entities in your database. This isn't ideal, but it's easy to reverse: pass the string to PHP's html_entity_decode() to undo the escaping.

