
Possible character encoding issues?

I am making a simple blog Android app where users can add and view blogs. To add a blog, there is a simple text view where the user enters the blog content (the blog text). That content is then sent to a PHP script via HttpPost, and the script stores it in a MySQL database.

My problem is that users can copy and paste text for the blog content into the text view. The source could be anything from web pages to textbooks, and the text could be in any font, color, etc. This is possibly leading to character encoding issues, because whenever I copy and paste text into the blog body, the blog submission fails; otherwise it works fine. My MySQL database collation is UTF-8.

My question is: how do I convert text from any possible source, in any encoding, to UTF-8?
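For illustration, here is a minimal sketch of how the client side could pin the request body to UTF-8 before it is posted. It assumes the pasted text arrives as an ordinary Java String (which is already Unicode, whatever font or color the source had) and that the PHP script reads a form field named `content`; both the field name and the `BlogFormBody` class are hypothetical, not taken from the app described above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds an application/x-www-form-urlencoded body
// in which every character is percent-encoded from its UTF-8 bytes.
final class BlogFormBody {
    static String build(String content) {
        try {
            // "content" is an assumed field name; the real script may use another.
            return "content=" + URLEncoder.encode(content, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A curly quote pasted from a web page is sent as its UTF-8 bytes.
        System.out.println(build("\u201Chello\u201D"));
    }
}
```

If the body is always built this way, the server receives UTF-8 regardless of where the text was copied from, which narrows any remaining failure down to the PHP script or the database connection settings.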

Take a look at https://github.com/neitanod/forceutf8

From their docs:

You don't need to know what the encoding of your strings is. It can be Latin1 (iso 8859-1), Windows-1252 or UTF8, or the string can have a mix of them. \ForceUTF8\Encoding::toUTF8() will convert everything to UTF8.

Sometimes you have to deal with services that are unreliable in terms of encoding, possibly mixing UTF8 and Latin1 in the same string.
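The general idea behind this kind of conversion can be sketched in a few lines. The example below is not forceutf8's implementation and does not handle strings that mix encodings. It is a simplified illustration, written in Java, that tries a strict UTF-8 decode first and falls back to Windows-1252 when the bytes are not valid UTF-8. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating "try UTF-8, else assume Windows-1252".
final class Utf8Fallback {
    static String decode(byte[] raw) {
        try {
            // REPORT makes the decoder throw on invalid input instead of
            // silently substituting replacement characters.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Assumption: anything that is not valid UTF-8 is Windows-1252.
            return new String(raw, Charset.forName("windows-1252"));
        }
    }

    // Re-encodes the decoded text so the result is always UTF-8 bytes.
    static byte[] toUtf8(byte[] raw) {
        return decode(raw).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};            // "café" in Latin1
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);  // "café" in UTF-8
        System.out.println(decode(latin1).equals(decode(utf8)));
    }
}
```

A whole-string fallback like this can misread input that happens to be valid UTF-8 by coincidence, which is one reason a dedicated library is usually the safer choice on the PHP side.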


 