Intersperse values into separate Vectors using generate

I am trying to generate a tuple of Vectors by using a function that creates a custom data type (or a tuple) of values from an index. Here is an approach that achieves the desired result:

import Prelude hiding (map, unzip)
import Data.Vector hiding (map)
import Data.Array.Repa
import Data.Functor.Identity

data Foo = Foo {fooX :: Int, fooY :: Int}

unfoo :: Foo -> (Int, Int)
unfoo (Foo x y) = (x, y)

make :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
make n f = unzip $ generate n getElt where
  getElt i = unfoo $ f i

However, I would like to do it in a single iteration per Vector, almost as shown below, but without evaluating the function f more than once per index:

make' :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
make' n f = (generate n getElt1, generate n getElt2) where
  getElt1 i = fooX $ f i
  getElt2 i = fooY $ f i

Just as a note, I understand that the Vector library supports fusion, and the first example is already pretty efficient. I need a solution for the generate concept in general; other libraries have very similar constructors (Repa has fromFunction, for example), and I am using Vectors here simply to demonstrate the problem.

Maybe some sort of memoization of the calls to f would work, but I cannot think of anything.

Edit:

Another demonstration of the problem using Repa:

makeR :: Int -> (Int -> Foo) -> (Array U DIM1 Int, Array U DIM1 Int)
makeR n f = runIdentity $ do
  let arr = fromFunction (Z :. n) (\ (Z :. i) -> unfoo $ f i)
  arr1 <- computeP $ map fst arr
  arr2 <- computeP $ map snd arr
  return (arr1, arr2)

Same as with vectors, fusion saves the day on performance, but an intermediate array arr of tuples is still required, which I am trying to avoid.

Edit 2: (3 years later)

In the Repa example above no intermediate array is created, since fromFunction creates a delayed array. Instead it is even worse: f is evaluated twice for each index, once for the first array and a second time for the second array. The delayed array must be computed in order to avoid such duplication of work.
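A minimal sketch of what "computing the delayed array" could look like, assuming the repa package and the Foo type from the question. The name makeR' is made up for this illustration, and this version deliberately materialises the array of tuples so that f runs once per index; it trades the duplicated work for the intermediate array rather than removing both.

```haskell
import Data.Array.Repa as R
import Data.Functor.Identity (runIdentity)

data Foo = Foo { fooX :: Int, fooY :: Int }

unfoo :: Foo -> (Int, Int)
unfoo (Foo x y) = (x, y)

-- Hypothetical variant of makeR: the delayed array is computed into a
-- manifest unboxed array first, so the two projections below only read
-- stored tuples instead of calling f again.
makeR' :: Int -> (Int -> Foo) -> (Array U DIM1 Int, Array U DIM1 Int)
makeR' n f = runIdentity $ do
  arr  <- computeUnboxedP (fromFunction (Z :. n) (\(Z :. i) -> unfoo (f i)))
  arr1 <- computeUnboxedP (R.map fst arr)
  arr2 <- computeUnboxedP (R.map snd arr)
  return (arr1, arr2)
```

Here R.map is applied to a manifest array, so each projection is a plain traversal of already computed values.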

Looking back at my own question from a few years ago, I can now easily show what I was trying to do back then and how to get it done.

In short, it can't be done purely, so we need to resort to the ST monad and manual mutation of two vectors. In the end we still get a nice pure function that creates only two vectors and does not rely on fusion.

import Control.Monad.ST
import Data.Vector.Primitive
import Data.Vector.Primitive.Mutable

data Foo = Foo {fooX :: Int, fooY :: Int}

make :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
make n f = runST $ do
  let n' = max 0 n
  mv1 <- new n'
  mv2 <- new n'
  let fillVectors i
        | i < n' = let Foo x y = f i
                   in write mv1 i x >> write mv2 i y >> fillVectors (i + 1)
        | otherwise = return ()
  fillVectors 0
  v1 <- unsafeFreeze mv1
  v2 <- unsafeFreeze mv2
  return (v1, v2)

And then we use it in the same way we would use generate:

λ> make 10 (\ i -> Foo (i + i) (i * i))
([0,2,4,6,8,10,12,14,16,18],[0,1,4,9,16,25,36,49,64,81])
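As a complement to the ST version, here is a sketch of an alternative that may be worth measuring when the element types have Unbox instances. Data.Vector.Unboxed represents a vector of pairs as a pair of vectors internally, and its unzip is documented as O(1), so generating pairs and unzipping calls f once per index. The names makeU and toPair are made up for this illustration, and how well it optimises in practice depends on the vector version and compiler flags.

```haskell
import qualified Data.Vector.Unboxed as U

data Foo = Foo { fooX :: Int, fooY :: Int }

-- Hypothetical alternative to make: build an unboxed vector of pairs
-- and split it. f is applied once per index inside toPair.
makeU :: Int -> (Int -> Foo) -> (U.Vector Int, U.Vector Int)
makeU n f = U.unzip (U.generate n toPair)
  where
    toPair i = case f i of
      Foo x y -> (x, y)
```

It is called the same way as make, for example makeU 10 (\ i -> Foo (i + i) (i * i)).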

The essential thing you're trying to write is

splat f = unzip . fmap f

which shares the results of evaluating f between the two result vectors, but you want to avoid the intermediate vector. Unfortunately, I'm pretty sure you can't have it both ways in any meaningful sense. Consider a vector of length 1 for simplicity. In order for the result vectors to share the result of f (v ! 0), each will need a reference to a thunk representing that result. Well, that thunk has to be somewhere, and it really might as well be in a vector.
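For concreteness, here is splat with an explicit type, assuming boxed vectors from Data.Vector, together with a wrapper that recovers the question's interface. The name makeViaSplat is made up for this illustration. With boxed vectors, the elements of both results are projections of the same shared pair, which is the sharing described above.

```haskell
import qualified Data.Vector as V

data Foo = Foo { fooX :: Int, fooY :: Int }

unfoo :: Foo -> (Int, Int)
unfoo (Foo x y) = (x, y)

-- splat as written above, specialised to boxed vectors.
splat :: (a -> (b, c)) -> V.Vector a -> (V.Vector b, V.Vector c)
splat f = V.unzip . fmap f

-- Hypothetical wrapper: apply splat to the indices 0 .. n - 1.
makeViaSplat :: Int -> (Int -> Foo) -> (V.Vector Int, V.Vector Int)
makeViaSplat n f = splat (unfoo . f) (V.enumFromN 0 n)
```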
