
Convert wide character strings to boost dates

I need to convert several million dates stored as wide strings into boost dates.

The following code works. However, it generates a horrible compiler warning and does not seem efficient.

Is there a better way?

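#include "boost/date_time/gregorian/gregorian.hpp"
using namespace boost::gregorian;

#include <algorithm>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    wstring ws( L"2008/01/01" );

    // copy each wide character into a narrow string; the implicit
    // wchar_t to char conversion here is the likely source of the warning
    string temp(ws.length(), '\0');
    copy(ws.begin(), ws.end(), temp.begin());
    date d1( from_simple_string( temp ) );

    cout << d1;
}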

The better way turns out to be to use the standard C++ library locale, which is a collection of facets. A facet is a service which allows the stream operators to handle a particular choice for date or time representation, or just about anything else. All the choices about different things, each handled by its own facet, are gathered together in a locale.
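To make the idea concrete, here is a minimal sketch using only the standard library, no Boost. The facet comma_decimal and the function format_with_comma are hypothetical names made up for illustration. The point is that swapping one facet in a locale changes how the ordinary stream operators behave.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: like the classic numpunct,
// except that ',' is used as the decimal point.
struct comma_decimal : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Format a number through a stream whose locale carries the custom facet.
std::string format_with_comma(double value) {
    std::ostringstream out;
    // The new locale is the classic one with its numpunct<char> replaced.
    // It takes ownership of the heap-allocated facet.
    out.imbue(std::locale(std::locale::classic(), new comma_decimal));
    out << value;
    return out.str();
}
```

Calling format_with_comma(1.5) should give "1,5", even though the code that writes the number is a plain operator<<.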

This solution was pointed out to me by litb, who gave me enough help to use facets in my production code, making it terser and faster. Thanks.

There is an excellent tutorial on locales and facets by Nathan Myers, who designed facets. He has a light style which makes his tutorial easy to read, though this is advanced stuff and your brain may hurt after the first read through - mine did. I suggest you go there now. For anyone who just wants the practicalities of converting wide character strings to boost dates, the rest of this post describes the minimum necessary to make it work.


litb first offered the following simple solution that removes the compiler warning. (The solution was edited before I got around to accepting it.) This looks like it does the same thing, converting wide characters one by one, but it avoids mucking around with temp strings and therefore is much clearer, I think. I really like that the compiler warning is gone.

#include "boost/date_time/gregorian/gregorian.hpp"
using namespace boost::gregorian;

#include <iostream>
#include <string>
using namespace std;

int main()
{
    wstring ws( L"2008/01/01" );

    // build the narrow string directly from the wide character range
    date d1( from_simple_string( string( ws.begin(), ws.end() ) ) );

    cout << d1;
}
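If a compiler still complains about the narrowing, one option is to make the conversion explicit with the standard std::ctype<wchar_t> facet. This is only a sketch: narrow_ascii is a hypothetical helper name, and it assumes the date strings contain nothing but plain ASCII digits and separators.

```cpp
#include <locale>
#include <string>

// Narrow a wide string to char using the classic locale's ctype facet.
// Characters with no char equivalent are replaced by '?'.
std::string narrow_ascii(const std::wstring& wide) {
    std::string narrow(wide.size(), '\0');
    if (wide.empty()) {
        return narrow;
    }
    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(std::locale::classic());
    // narrow(low, high, default, to) converts the whole range in one call
    ct.narrow(wide.data(), wide.data() + wide.size(), '?', &narrow[0]);
    return narrow;
}
```

The result can be passed straight to from_simple_string, for example from_simple_string( narrow_ascii( ws ) ).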

litb went on to suggest using "facets", which I had never heard of before. They seem to do the job, producing incredibly terse code inside the loop, at the cost of a prologue where the locale is set up.

wstring ws( L"2008/01/01" );

// construct a locale to collect all the particulars of the 'greek' style
locale greek_locale;
// construct a facet to handle greek dates - wide characters in 2008/12/31 format
wdate_input_facet greek_date_facet(L"%Y/%m/%d");
// add facet to locale
greek_locale = locale( greek_locale, &greek_date_facet );
// construct stringstream to use greek locale
std::wstringstream greek_ss; 
greek_ss.imbue( greek_locale );

date d2;

greek_ss << ws;
greek_ss >> d2;

cout << d2;

This, it turns out, is also more efficient:

clock_t start, finish;
double  duration;

start = clock();
for( int k = 0; k < 100000; k++ ) {
    string temp(ws.length(), '\0');
    copy(ws.begin(), ws.end(), temp.begin());
    date d1( from_simple_string( temp ) );
}
finish = clock();
duration = (double)(finish - start) / CLOCKS_PER_SEC;
cout << "1st method: " << duration << endl;

start = clock();
for( int k = 0; k < 100000; k++ ) {
    date d1( from_simple_string( string( ws.begin(), ws.end() ) ) );
}
finish = clock();
duration = (double)(finish - start) / CLOCKS_PER_SEC;
cout << "2nd method: " << duration << endl;

start = clock();
for( int k = 0; k < 100000; k++ ) {
    greek_ss << ws;
    greek_ss >> d2;
    greek_ss.clear();
}
finish = clock();
duration = (double)(finish - start) / CLOCKS_PER_SEC;
cout << "3rd method: " << duration << endl;

This produces the following output:

1st method: 2.453
2nd method: 2.422
3rd method: 1.968
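The three timing loops repeat the same boilerplate. For anyone re-running the comparison, a small helper can wrap it. This is only a sketch: time_runs is a hypothetical name, and it measures wall-clock time with std::chrono::steady_clock rather than processor time with clock(), so the numbers will not be directly comparable to those above.

```cpp
#include <chrono>

// Run work() `iterations` times and return the elapsed wall-clock seconds.
template <typename Work>
double time_runs(int iterations, Work work) {
    const auto start = std::chrono::steady_clock::now();
    for (int k = 0; k < iterations; ++k) {
        work();
    }
    const auto finish = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(finish - start).count();
}
```

Each method then becomes a single call, for example time_runs(100000, [&] { date d1( from_simple_string( string( ws.begin(), ws.end() ) ) ); }).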

OK, this is now in the production code and passing regression tests. It looks like this:

// ... construct greek locale and stringstream

// ... loop over input extracting date strings

    // convert range to boost dates
    date d1;
    greek_ss << sd1;
    greek_ss >> d1;
    if( greek_ss.fail() ) {
        // input is garbled
        wcout << L"do not understand " << sl << endl;
        exit(1);
    }
    greek_ss.clear();

// ... finish processing and end loop

I have one final question about this. Adding the facet to the locale seems to require two invocations of the locale copy constructor:

    // add facet to locale
greek_locale = locale( greek_locale, &greek_date_facet );

Why is there not an add( facet* ) method? ( _Addfac() is complex, undocumented and deprecated. )

efotinis found a good way using from_stream.


I've looked into the manual of date_time and found it supports facets:

#include <boost/date_time/gregorian/gregorian.hpp>
#include <iostream>
#include <sstream>
#include <locale>

int main() {
    using namespace boost::gregorian;

    std::wstringstream ss;
    wdate_input_facet * fac = new wdate_input_facet(L"%Y-%m-%d");
    ss.imbue(std::locale(std::locale::classic(), fac));

    date d;
    ss << L"2004-01-01 2005-01-01 2006-06-06";
    while(ss >> d) {
        std::cout << d << std::endl;
    }
}

You could also go with that.


I've looked up how date facets work:

  • The boost::date_time::date_input_facet template implements a facet.
  • Facets are derived from std::locale::facet and each one has a unique id.
  • You can imbue a new locale into a stream, replacing its old locale. The locale of a stream will be used for all sorts of parsing and conversions.
  • When you create a new std::locale using the form I showed, you give it an existing locale and a pointer to a facet. The given facet will replace any existing facet of the same type in the locale given. (So it would replace any other date_input_facet in use.)
  • All facets are associated with the locale somehow, so that you can use std::has_facet<Facet>(some_locale) to check whether the given locale has some given facet type.
  • You can use a facet from one locale by doing std::use_facet<Facet>(some_locale).some_member...
  • date_input_facet has a function get, which can be used like this:

The following is essentially what operator>> in boost::date_time does:

// assume src is a stream having the wdate_input_facet in its locale. 
// wdate_input_facet is a boost::date_time::date_input_facet<date,wchar_t> typedef.

date d;

// iterate over characters of src
std::istreambuf_iterator<wchar_t> b(src), e;

// use the facet to parse the date
std::use_facet<wdate_input_facet>(src.getloc()).get(b, e, src, d);
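The points above about ids, std::has_facet and std::use_facet can be tried out without Boost. The sketch below defines a made-up facet, separator_facet, purely for illustration; the helper function names are hypothetical as well.

```cpp
#include <cstddef>
#include <locale>

// Hypothetical custom facet: carries a date separator character.
class separator_facet : public std::locale::facet {
public:
    static std::locale::id id;  // every facet type needs its own id
    explicit separator_facet(char sep, std::size_t refs = 0)
        : std::locale::facet(refs), sep_(sep) {}
    char separator() const { return sep_; }

private:
    char sep_;
};
std::locale::id separator_facet::id;

// Returns a copy of base with a separator_facet added or replaced.
// The base locale itself is left untouched.
std::locale with_separator(const std::locale& base, char sep) {
    return std::locale(base, new separator_facet(sep));
}

// True if the locale carries a separator_facet.
bool has_separator(const std::locale& loc) {
    return std::has_facet<separator_facet>(loc);
}

// Reads the separator; use_facet throws std::bad_cast if the facet is absent.
char separator_of(const std::locale& loc) {
    return std::use_facet<separator_facet>(loc).separator();
}
```

Because a locale is never modified in place, adding a facet always means constructing a new locale from an old one, which is the pattern used with wdate_input_facet earlier.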

You can use the from_stream parser function:

using boost::gregorian::date;
using boost::gregorian::from_stream;

std::wstring ws( L"2008/01/01" );
date d1(from_stream(ws.begin(), ws.end()));
std::cout << d1;  // prints "2008-Jan-01"

