
Parsing a UTF-8 string from a server response

I have implemented an app on a device that sends data to and receives data from a server. Data from the server usually arrives in this form:

"1;username;someInteger;"

Parsing was easy: as you can imagine, I was using strtok to retrieve the individual values from that string, namely 1, username, and someInteger.

But now a situation may occur where the server sends me a Unicode string as the username.

I think a good idea is to have the username sent as a UTF-8 encoded string (am I right?). What do you recommend: how should I parse it out of the string above? For example, which symbol should I use as a separator (instead of ";"), and which functions should I use to extract the username?

As this is an embedded device, I want to avoid installing third-party libraries on it (which might not even be possible), so "pure" approaches would be preferable.

The character ';' is encoded the same way in UTF-8 as in ASCII, because the first 128 code points (0 through 127) are identical in both encodings. In addition, every byte of a multi-byte UTF-8 sequence has its high bit set, so the byte 0x3B can never appear inside another character. That means you can still use strtok to split on ';'.
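As a rough illustration of the point above, here is a minimal sketch that splits the message with strtok exactly as before. The helper name split_fields and its parameters are made up for this example and are not from the original post. It assumes the buffer is writable (strtok overwrites each ';' with a terminator) and that empty fields do not occur, since strtok skips them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `msg` in place on ';' and stores pointers to
 * up to `max_fields` tokens in `fields`. Returns the number of tokens stored.
 * strtok() overwrites each ';' with '\0', so `msg` must be writable, and it
 * skips empty fields (";;" yields no token). */
static size_t split_fields(char *msg, char *fields[], size_t max_fields)
{
    size_t count = 0;

    assert(msg != NULL && fields != NULL);

    for (char *tok = strtok(msg, ";");
         tok != NULL && count < max_fields;
         tok = strtok(NULL, ";")) {
        /* Each token is still valid UTF-8: the split only ever happens
         * on the single-byte ';' (0x3B). */
        fields[count++] = tok;
    }
    return count;
}
```

Called on a writable copy of a message such as "1;Zoë;42;" (with the ë sent as the two bytes 0xC3 0xAB), this should yield three tokens, with the second one holding the UTF-8 bytes of the username untouched.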

The whole point of UTF-8 is that you hardly have to do anything at all. ASCII characters are still encoded as the same ASCII bytes they always were, so if you just keep using semicolon separators, nothing needs to change.
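If modifying the receive buffer or relying on strtok's hidden static state is a concern on the device, the same byte-level reasoning applies to strchr. The following is only a sketch: get_field and its parameters are hypothetical names, and it assumes every field, including the last, is terminated by ';' as in the sample message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies field number `index` (0-based) of a message
 * whose fields are each terminated by ';' into `out` as a NUL-terminated
 * string. Returns the field length in bytes, or -1 if the field does not
 * exist or does not fit in `out_size` bytes. The input is not modified. */
static int get_field(const char *msg, size_t index, char *out, size_t out_size)
{
    assert(msg != NULL && out != NULL);

    const char *start = msg;
    const char *end = strchr(start, ';');

    /* Advance one field at a time until the requested one is reached. */
    while (end != NULL && index > 0) {
        start = end + 1;
        end = strchr(start, ';');
        index--;
    }
    if (end == NULL) {
        return -1;              /* fewer fields than requested */
    }

    size_t len = (size_t)(end - start);
    if (len + 1 > out_size) {
        return -1;              /* would overflow the output buffer */
    }
    memcpy(out, start, len);    /* UTF-8 bytes are copied unchanged */
    out[len] = '\0';
    return (int)len;
}
```

Note that the returned length counts bytes, not characters, so a username containing non-ASCII characters needs a buffer larger than its visible length.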
