How to use u8_to_u32_iterator in Boost Spirit X3?

I am using Boost Spirit X3 to create a programming language, but when I try to support Unicode, I get a compile error.
Here is a simplified version of that program.

#define BOOST_SPIRIT_X3_UNICODE
#include <boost/spirit/home/x3.hpp>

namespace x3 = boost::spirit::x3;

struct sample : x3::symbols<unsigned> {
    sample()
    {
        add("48", 10);
    }
};

int main()
{
  const std::string s("🌸");

  boost::u8_to_u32_iterator<std::string::const_iterator> first{cbegin(s)},
    last{cend(s)};

  x3::parse(first, last, sample{});
}

Live on Wandbox

What should I do?

As you noticed, char_encoding::unicode uses char32_t internally.

So, first change the symbols parser accordingly:

template <typename T>
using symbols = x3::symbols_parser<boost::spirit::char_encoding::unicode, T>;

struct sample : symbols<unsigned> {
    sample() { add(U"48", 10); }
};

Now the code fails when calling into case_compare:

/home/sehe/custom/boost_1_78_0/boost/spirit/home/x3/string/detail/tst.hpp|74 col 33| error: no match for call to ‘(boost::spirit::x3::case_compare<boost::spirit::char_encoding::unicode>) (reference, char32_t&)’

As you can see, it expects a char32_t reference, but u8_to_u32_iterator returns unsigned ints (std::uint32_t) by default.

Just for comparison / sanity check: https://godbolt.org/z/1zozxq96W
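The mismatch can be reproduced without Boost. The following is a minimal sketch, assuming a comparator whose two arguments share a single deduced template parameter; same_type_compare is a hypothetical stand-in, and whether x3's case_compare has exactly this shape is an assumption rather than something confirmed here. It shows that char32_t and std::uint32_t are distinct types, so mixing them in such a call does not compile.

```cpp
#include <cstdint>
#include <type_traits>

// char32_t is a fundamental type of its own, not an alias of std::uint32_t.
static_assert(!std::is_same_v<char32_t, std::uint32_t>,
              "char32_t and std::uint32_t are distinct types");

// Hypothetical comparator (not Boost code): both arguments must deduce to
// the same type Char.
struct same_type_compare {
    template <typename Char>
    bool operator()(Char a, Char b) const { return a == b; }
};

// Two char32_t arguments deduce Char consistently, so the call is valid.
static_assert(std::is_invocable_v<same_type_compare, char32_t&, char32_t&>);

// A std::uint32_t argument paired with a char32_t argument gives conflicting
// deductions for Char, so there is no matching call.
static_assert(!std::is_invocable_v<same_type_compare, std::uint32_t&, char32_t&>);
```

This compiles as C++17. The static_asserts pass only because the mixed call is rejected, which is the same kind of "no match for call" situation as in the error above.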

Luckily, you can instruct u8_to_u32_iterator to use another co-domain (value) type through its second template argument:

Live On Compiler Explorer

#define BOOST_SPIRIT_X3_UNICODE
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <string>

namespace x3 = boost::spirit::x3;

template <typename T>
using symbols = x3::symbols_parser<boost::spirit::char_encoding::unicode, T>;

struct sample : symbols<unsigned> {
    sample() { add(U"48", 10)(U"🌸", 11); }
};

int main() {
    auto test = [](auto const& s) {
        boost::u8_to_u32_iterator<decltype(cbegin(s)), char32_t> first{
            cbegin(s)},
            last{cend(s)};

        unsigned parsed_value;
        if (x3::parse(first, last, sample{}, parsed_value)) {
            std::cout << s << " -> " << parsed_value << "\n";
        } else {
            std::cout << s << " FAIL\n";
        }
    };

    for (std::string s : {"🌸", "48", "🤷"})
        test(s);
}

Prints

🌸 -> 11
48 -> 10
🤷 FAIL
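To see why the symbol table is keyed on char32_t, it may help to look at what a UTF-8 to UTF-32 adaptor does conceptually. The sketch below is a standard-library-only illustration, not Boost's implementation: decode_utf8 is a hypothetical helper, and it skips the overlong-encoding and surrogate checks a production decoder needs.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Boost): decode UTF-8 bytes into code points.
inline std::u32string decode_utf8(std::string const& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        auto const lead = static_cast<unsigned char>(in[i]);
        // The high bits of the lead byte give the sequence length.
        std::size_t const len = lead < 0x80           ? 1
                              : (lead & 0xE0) == 0xC0 ? 2
                              : (lead & 0xF0) == 0xE0 ? 3
                              : (lead & 0xF8) == 0xF0 ? 4
                                                      : 0;
        if (len == 0 || i + len > in.size())
            throw std::invalid_argument("malformed UTF-8");
        // Keep only the payload bits of the lead byte.
        char32_t cp = lead & (len == 1 ? 0x7F : 0xFF >> (len + 1));
        for (std::size_t k = 1; k < len; ++k) {
            auto const cont = static_cast<unsigned char>(in[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("malformed UTF-8");
            cp = (cp << 6) | (cont & 0x3F); // each continuation byte adds 6 bits
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

For example, decode_utf8("\xF0\x9F\x8C\xB8") (the four UTF-8 bytes of 🌸) yields the single code point U+1F338, which is why the entry U"🌸" in the symbol table matches one element of the decoded range, while "48" decodes to two code points.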
