Boost to_upper function of string_algo doesn't take the locale into account
I have a problem with the functions in the string_algo package.
Consider this piece of code:
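#include <boost/algorithm/string.hpp>
#include <iostream>
#include <locale>
#include <stdexcept>
#include <string>

using namespace std;

int main() {
    try {
        string s = "meißen";
        locale l("de_DE.UTF-8");   // throws std::runtime_error if the locale is not installed
        boost::to_upper(s, l);
        cout << s << endl;
    } catch (std::runtime_error& e) {
        cerr << e.what() << endl;
    }
    try {
        string s = "composición";
        locale l("es_CO.UTF-8");
        boost::to_upper(s, l);
        cout << s << endl;
    } catch (std::runtime_error& e) {
        cerr << e.what() << endl;
    }
}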
The expected output for this code would be:
MEISSEN
COMPOSICIÓN
However, the only thing I get is:
MEIßEN
COMPOSICIóN
So, clearly, the locale is not being taken into account. I even tried setting the global locale, with no success. What can I do?
In addition to Éric Malenfant's answer: std::locale facets work on a single character at a time. To get a better result you can use std::wstring, so that more characters are converted, but as you can see it is still not perfect (for example, ß).
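To make the std::wstring suggestion concrete, here is a minimal sketch of upper-casing a wide string through a locale's character facet. The helper name upper_wide is made up for this illustration; it is not part of Boost or the standard library, and which letters actually get converted depends on the locale data installed on your platform.

```cpp
#include <locale>
#include <string>

// Hypothetical helper (not a Boost or standard function): upper-cases a
// wide string one wchar_t at a time using the ctype facet of `loc`.
std::wstring upper_wide(std::wstring text, const std::locale& loc) {
    for (wchar_t& ch : text) {
        // std::toupper(ch, loc) returns exactly one wchar_t, so the string
        // can never grow: a single ß cannot turn into the two letters SS.
        ch = std::toupper(ch, loc);
    }
    return text;
}
```

You would call it as upper_wide(L"composición", std::locale("es_CO.UTF-8")). Because each letter is now a single wchar_t, the facet at least gets a chance to map ó, while ß still has no one-character upper-case form to map to.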
I would suggest giving Boost.Locale a try (a new library for Boost, not yet in Boost), which handles this kind of conversion: http://cppcms.sourceforge.net/boost_locale/docs/
See especially http://cppcms.sourceforge.net/boost_locale/docs/index.html#conversions, which deals with the problem you are talking about.
std::toupper assumes a 1:1 conversion, so there is no hope for the ß to SS case, Boost.StringAlgo or not.
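As a rough illustration of that 1:1 behaviour, the sketch below applies std::toupper to each char of a std::string. The helper upper_per_char is hypothetical and only approximates what a per-character algorithm does; it is not StringAlgo's actual code.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical helper: maps every char through std::toupper(ch, loc).
// One char goes in and one char comes out, so the result always has the
// same length as the input.
std::string upper_per_char(std::string text, const std::locale& loc) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [&loc](char ch) { return std::toupper(ch, loc); });
    return text;
}
```

Since the output has the same number of chars as the input, a conversion that needs to lengthen the string, such as ß to SS, cannot be expressed this way.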
Looking at StringAlgo's code, we see that it does use the locale (except on Borland, it seems). So, for the other case, I'm curious: what is the result of
toupper('ó', std::locale("es_CO.UTF-8"))
on your platform?
Writing the above makes me think of something else: what is the encoding of the strings in your sources? UTF-8? In that case, std::toupper will see two code units for 'ó', so there is no hope. Latin-1? In that case, using a locale named ".UTF-8" is inconsistent.
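To check the "two code units" point yourself, a small sketch like the following compares the byte length of a UTF-8 string with its number of code points. The helper utf8_code_points is made up for this example and assumes the input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 string by
// skipping continuation bytes (those of the form 10xxxxxx).
// Assumes the input is valid UTF-8.
std::size_t utf8_code_points(const std::string& text) {
    std::size_t count = 0;
    for (unsigned char byte : text) {
        if ((byte & 0xC0) != 0x80) {
            ++count;  // lead byte or plain ASCII byte starts a new code point
        }
    }
    return count;
}
```

For "composición" stored as UTF-8, size() reports one byte more than the number of code points, because 'ó' is encoded as the two bytes 0xC3 0xB3. A function that sees one char at a time only ever sees half of that letter.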
In the standard library there is std::toupper (which boost::to_upper uses), and it operates on one character at a time. This explains why the ß doesn't work. You didn't say which standard library and code page you are using, so I don't know why the ó didn't work. What happens if you use wstring instead?