Algorithm intuition: transform but with possibly more output items than input items?

I want to escape a string. That is, I want to copy characters, but where the input has " or \, I want to prepend \ on the output. I can write that easily enough, even very generically:

//! Given a range of code units (e.g., bytes in utf-8), transform them to an output range.
//! If a code unit needs escaping (per the given predicate), insert esc before that code unit in the output.
template <typename InRange, typename OutIter, typename NeedsEscapingPred, typename EscCodeUnit>
OutIter transformEscapeCodeUnits(const InRange& in, OutIter out,
     NeedsEscapingPred needsEscaping,
     EscCodeUnit esc) {
     for (const auto& codeUnit : in) {
         if (needsEscaping(codeUnit)) {
             *out++ = esc;
         }
         *out++ = codeUnit;
     }
     return out;
}

//! Convenience overload for common case: escape backslash and double quote.
template <typename InRange, typename OutIter>
OutIter transformEscapeCodeUnits(const InRange& in, OutIter out, char esc = '\\') {
    return transformEscapeCodeUnits(in, out, [](auto c) { return c == '\\' || c == '"'; }, esc);
}
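As a usage sketch, the generic function can write into a std::string through std::back_inserter, which is what lets the output grow past the input length. The helper name escapeQuoted is hypothetical (it is not part of the original post), and the template is repeated here only so the snippet compiles on its own.

```cpp
#include <iterator>
#include <string>

// Same shape as transformEscapeCodeUnits above, repeated so this snippet is self-contained.
template <typename InRange, typename OutIter, typename NeedsEscapingPred, typename EscCodeUnit>
OutIter transformEscapeCodeUnits(const InRange& in, OutIter out,
                                 NeedsEscapingPred needsEscaping, EscCodeUnit esc) {
    for (const auto& codeUnit : in) {
        if (needsEscaping(codeUnit)) {
            *out++ = esc;  // extra output item for this input item
        }
        *out++ = codeUnit;
    }
    return out;
}

// Hypothetical helper: escape '"' and '\' with a backslash and return a new string.
inline std::string escapeQuoted(const std::string& in) {
    std::string result;
    result.reserve(in.size());  // lower bound only; the output may be longer
    transformEscapeCodeUnits(in, std::back_inserter(result),
                             [](char c) { return c == '\\' || c == '"'; }, '\\');
    return result;
}
```

Writing through an inserting iterator sidesteps the question of how large the destination must be, which is the part that std::transform into a pre-sized buffer cannot express.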

However, in the spirit of "no raw loops", I looked at the <algorithm> and <numeric> headers in search of a generic algorithm to do this. There's replace_if and replace_copy_if and remove_if, but I'm not seeing any std algorithms that take a sequence and output a potentially-longer sequence. This would be basically insert_copy_if, or even more generically, something like transform_items:

//! Like transform, but TransformItem takes an element and an output iterator, writes zero or more
//! output elements, and returns the advanced output iterator. Requires <utility> for std::forward.
template <typename InRange, typename OutIter, typename TransformItem>
OutIter transform_items(InRange&& inRange, OutIter out, TransformItem transformItem) {
    for (auto&& x : std::forward<InRange>(inRange)) {
        out = transformItem(std::forward<decltype(x)>(x), out);
    }
    return out;
}

Then the escaping case would call transform_items(in, out, [shouldEsc, esc](auto c, auto out) { if (shouldEsc(c)) { *out++ = esc; } *out++ = c; return out; }). Note that the lambda has to return the advanced iterator, since transform_items assigns the result back to out.

Am I missing something, or is there nothing quite like that in the standard library?

After asking on Twitter, I got some very satisfying answers, although they involve the M-word (monad).

First, @atorstling suggests that this is flatMap in JavaScript: https://twitter.com/atorstling/status/1574097704098988033?s=20&t=jzS503R6fMOqCajReygzJg https://dmitripavlutin.com/javascript-array-flatmap/

So translating to C++, I think the answer to my question is that this should be called flat_transform. With pedantic use of forwarding, that's:

//! Like transform, but TransformOne takes an element and an output iterator, writes zero or more
//! output elements, and returns the advanced output iterator. Requires <utility>.
template <typename InRange, typename OutIter, typename TransformOne>
OutIter flat_transform(InRange&& inRange,
                       OutIter out,
                       TransformOne transform_one) {
    for (auto&& x : std::forward<InRange>(inRange)) {
        out = transform_one(std::forward<decltype(x)>(x), std::move(out));
    }
    return out;
}
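To illustrate the "zero or more output elements" part, here is a small sketch that is not about escaping: each integer n is expanded into n copies of itself, so 0 produces nothing and 3 produces three items. The helper name repeatEach is hypothetical, and flat_transform is repeated so the snippet stands alone. It relies on std::fill_n returning the advanced output iterator.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

// flat_transform as defined above, repeated so this snippet is self-contained.
template <typename InRange, typename OutIter, typename TransformOne>
OutIter flat_transform(InRange&& inRange, OutIter out, TransformOne transform_one) {
    for (auto&& x : std::forward<InRange>(inRange)) {
        out = transform_one(std::forward<decltype(x)>(x), std::move(out));
    }
    return out;
}

// Hypothetical example: write n copies of each n (nothing at all when n is 0).
inline std::vector<int> repeatEach(const std::vector<int>& in) {
    std::vector<int> result;
    flat_transform(in, std::back_inserter(result), [](int n, auto out) {
        // std::fill_n returns the iterator past the last element written,
        // which is exactly what flat_transform expects back.
        return std::fill_n(std::move(out), n, n);
    });
    return result;
}
```

Because the callback receives and returns the output iterator, existing algorithms that already return an advanced iterator (std::fill_n, std::copy, and similar) can serve directly as the per-element step.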

And yes, this appears to be a "missing algorithm" from the STL.

But additionally, @salociN001 pointed out that "It's basically the monadic bind operation: Transforming a Container of A with a function taking A and returning a container of A into a flat container of A." https://twitter.com/salociN001/status/1573918982557548545?s=20&t=jzS503R6fMOqCajReygzJg
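Read literally, that description takes a function that returns a container rather than one that writes through an iterator. A minimal sketch of that container-returning form, restricted to std::vector for simplicity, might look like the following. The names flat_map and escapeChars are assumptions made for illustration; neither is a standard library facility.

```cpp
#include <iterator>
#include <vector>

// Hypothetical container-returning variant: f maps one element to a vector of
// results, and the per-element vectors are concatenated into one flat vector.
template <typename A, typename F>
auto flat_map(const std::vector<A>& in, F f) -> decltype(f(in.front())) {
    decltype(f(in.front())) result;
    for (const A& x : in) {
        auto part = f(x);  // the "container of A" produced for this element
        result.insert(result.end(),
                      std::make_move_iterator(part.begin()),
                      std::make_move_iterator(part.end()));
    }
    return result;
}

// Escaping expressed this way: each char becomes a vector of one or two chars.
inline std::vector<char> escapeChars(const std::vector<char>& in) {
    return flat_map(in, [](char c) {
        return (c == '\\' || c == '"') ? std::vector<char>{'\\', c}
                                       : std::vector<char>{c};
    });
}
```

This form is closer to JavaScript's flatMap, but it builds a temporary container per element. The iterator-passing flat_transform avoids that cost, which is probably the more natural trade-off for an STL-style algorithm.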

So there we have it:

  • It's monadic bind.
  • A reasonable C++ name is flat_transform.
  • It is, in fact, a missing algorithm.
