Convert wide character strings to boost dates
I need to convert several million dates stored as wide strings into boost dates.
The following code works. However, it generates a horrible compiler warning and does not seem efficient.
Is there a better way?
#include "boost/date_time/gregorian/gregorian.hpp"
using namespace boost::gregorian;
#include <string>
using namespace std;
wstring ws( L"2008/01/01" );
string temp(ws.length(), '\0');
copy(ws.begin(), ws.end(), temp.begin());
date d1( from_simple_string( temp ) );
cout << d1;
The better way turns out to be to use the standard C++ library locale, which is a collection of facets. A facet is a service that lets the stream operators handle a particular choice of date or time representation, or just about anything else. All the choices about different things, each handled by its own facet, are gathered together in a locale.
This solution was pointed out to me by litb, who gave me enough help to use facets in my production code, making it terser and faster. Thanks.
There is an excellent tutorial on locales and facets by Nathan Myers, who designed facets. He has a light style which makes his tutorial easy to read, though this is advanced stuff and your brain may hurt after the first read-through; mine did. I suggest you go there now. For anyone who just wants the practicalities of converting wide character strings to boost dates, the rest of this post describes the minimum necessary to make it work.
litb first offered the following simple solution that removes the compiler warning. (The solution was edited before I got around to accepting it.) This looks like it does the same thing, converting wide characters one by one, but it avoids mucking around with temp strings and is therefore much clearer, I think. I really like that the compiler warning is gone.
#include "boost/date_time/gregorian/gregorian.hpp"
using namespace boost::gregorian;
#include <string>
using namespace std;
wstring ws( L"2008/01/01" );
date d1( from_simple_string( string( ws.begin(), ws.end() ) ) );
cout << d1;
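The warning in the first version presumably comes from the implicit wchar_t to char conversion that happens inside std::copy. If you would rather make that narrowing explicit, here is a minimal sketch using the standard ctype facet. It assumes the date strings contain only ASCII characters; narrow_ascii is a hypothetical helper name, not something from Boost or from the posts above.

```cpp
#include <locale>
#include <string>

// Hypothetical helper (illustrative only): narrows a wide string to a byte
// string through the ctype facet of the classic "C" locale.
// Characters that have no narrow equivalent are replaced by '?'.
std::string narrow_ascii(const std::wstring& wide)
{
    std::string narrow(wide.size(), '\0');
    if (wide.empty()) {
        return narrow;
    }
    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(std::locale::classic());
    // Range form of narrow(): writes one char per input wchar_t into the buffer.
    ct.narrow(wide.data(), wide.data() + wide.size(), '?', &narrow[0]);
    return narrow;
}
```

The result can then be passed to from_simple_string as before, e.g. from_simple_string( narrow_ascii( ws ) ).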
litb went on to suggest using "facets", which I had never heard of before. They seem to do the job, producing incredibly terse code inside the loop, at the cost of a prologue where the locale is set up.
wstring ws( L"2008/01/01" );
// construct a locale to collect all the particulars of the 'greek' style
locale greek_locale;
// construct a facet to handle greek dates - wide characters in 2008/12/31 format
wdate_input_facet greek_date_facet(L"%Y/%m/%d");
// add facet to locale
greek_locale = locale( greek_locale, &greek_date_facet );
// construct stringstream to use greek locale
std::wstringstream greek_ss;
greek_ss.imbue( greek_locale );
date d2;
greek_ss << ws;
greek_ss >> d2;
cout << d2;
This, it turns out, is also more efficient:
clock_t start, finish;
double duration;
start = clock();
for( int k = 0; k < 100000; k++ ) {
string temp(ws.length(), '\0');
copy(ws.begin(), ws.end(), temp.begin());
date d1( from_simple_string( temp ) );
}
finish = clock();
duration = (double)(finish - start) / CLOCKS_PER_SEC;
cout << "1st method: " << duration << endl;
start = clock();
for( int k = 0; k < 100000; k++ ) {
date d1( from_simple_string( string( ws.begin(), ws.end() ) ) );
}
finish = clock();
duration = (double)(finish - start) / CLOCKS_PER_SEC;
cout << "2nd method: " << duration << endl;
start = clock();
for( int k = 0; k < 100000; k++ ) {
greek_ss << ws;
greek_ss >> d2;
greek_ss.clear();
}
finish = clock();
duration = (double)(finish - start) / CLOCKS_PER_SEC;
cout << "3rd method: " << duration << endl;
Produces the following output:
1st method: 2.453
2nd method: 2.422
3rd method: 1.968
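The timing code repeats the same clock() bookkeeping for each method. As a sketch only, that bookkeeping could be factored into a small helper. time_loop is a hypothetical name, and it uses std::chrono::steady_clock (C++11), which measures wall-clock time rather than the processor time reported by clock(), so its numbers are not directly comparable with the figures quoted here.

```cpp
#include <chrono>

// Hypothetical helper (illustrative only): runs `body` `iterations` times
// and returns the elapsed wall-clock time in seconds.
template <typename Body>
double time_loop(int iterations, Body body)
{
    const auto start = std::chrono::steady_clock::now();
    for (int k = 0; k < iterations; ++k) {
        body();
    }
    const auto finish = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(finish - start).count();
}
```

Each method then becomes a single call, for example time_loop( 100000, [&]{ greek_ss << ws; greek_ss >> d2; greek_ss.clear(); } ).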
OK, this is now in the production code and passing regression tests. It looks like this:
// .. construct greek locale and stringstream
// ... loop over input extracting date strings
// convert range to boost dates
date d1;
greek_ss << sd1;
greek_ss >> d1;
if( greek_ss.fail() ) {
// input is garbled
wcout << L"do not understand " << sl << endl;
exit(1);
}
greek_ss.clear();
// finish processing and end loop
I have one final question about this. Adding the facet to the locale seems to require two invocations of the locale copy constructor:
// add facet to locale
greek_locale = locale( greek_locale, &greek_date_facet );
Why is there not an add( facet* ) method? ( _Addfac() is complex, undocumented and deprecated. )
efotinis found a good way using from_stream.
I've looked into the manual of date_time and found it supports facets:
#include <boost/date_time/gregorian/gregorian.hpp>
#include <iostream>
#include <sstream>
#include <locale>
int main() {
using namespace boost::gregorian;
std::wstringstream ss;
wdate_input_facet * fac = new wdate_input_facet(L"%Y-%m-%d");
ss.imbue(std::locale(std::locale::classic(), fac));
date d;
ss << L"2004-01-01 2005-01-01 2006-06-06";
while(ss >> d) {
std::cout << d << std::endl;
}
}
You could also go with that.
I've looked up how date facets work:
The boost::date_time::date_input_facet template implements a facet. Facets are derived from std::locale::facet and every one has a unique id. When you create a new std::locale using the form I showed, you give it an existing locale and a pointer to a facet. The given facet will replace any existing facet of the same type in the locale given (so it would replace any other date_input_facet used). All facets are associated with the locale somehow, so you can use std::has_facet<Facet>(some_locale) to check whether the given locale has some given facet type. You can use a facet from one locale by doing std::use_facet<Facet>(some_locale).some_member... .
The below is essentially done by operator>> in boost::date_time:
// assume src is a stream having the wdate_input_facet in its locale.
// wdate_input_facet is a boost::date_time::date_input_facet<date,wchar_t> typedef.
date d;
// iterate over characters of src
std::istreambuf_iterator<wchar_t> b(src), e;
// use the facet to parse the date
std::use_facet<wdate_input_facet>(src.getloc()).get(b, e, src, d);
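The facet mechanics described here are not specific to Boost. As a minimal sketch using only the standard library, the following replaces the numpunct<char> facet of the classic locale and then queries it with std::has_facet and std::use_facet. The names comma_decimal, decimal_point_of and format_with_comma are hypothetical and exist only for this illustration.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet (illustrative only): a numpunct that uses ',' as the
// decimal point. It inherits std::numpunct<char>::id, so it takes the place
// of the numpunct<char> facet in any locale it is added to.
class comma_decimal : public std::numpunct<char>
{
protected:
    char do_decimal_point() const override { return ','; }
};

// Looks up the numpunct<char> facet of `loc` and asks it for its decimal point.
char decimal_point_of(const std::locale& loc)
{
    if (!std::has_facet<std::numpunct<char> >(loc)) {
        return '\0';
    }
    return std::use_facet<std::numpunct<char> >(loc).decimal_point();
}

// Formats `value` through a stream imbued with a locale carrying the custom facet.
std::string format_with_comma(double value)
{
    // New locale = classic locale with numpunct<char> replaced; the locale owns the facet.
    const std::locale loc(std::locale::classic(), new comma_decimal);
    std::ostringstream out;
    out.imbue(loc);
    out << value;
    return out.str();
}
```

The original locale is left untouched: decimal_point_of( std::locale::classic() ) still reports '.', while the derived locale reports ','.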
You can use the from_stream parser function:
using boost::gregorian::date;
using boost::gregorian::from_stream;
std::wstring ws( L"2008/01/01" );
date d1(from_stream(ws.begin(), ws.end()));
std::cout << d1; // prints "2008-Jan-01"