简体   繁体   English

Data.Semigroup 中 ArgMin 和 ArgMax 类型同义词的用途是什么?

[英]What is the purpose of the ArgMin and ArgMax type synonyms in Data.Semigroup?

The base library in Haskell has the following type synonyms in Data.Semigroup : Haskell 中的base库在Data.Semigroup具有以下类型同义词:

type ArgMin a b = Min (Arg a b)

type ArgMax a b = Max (Arg a b) 

Here are links to the haddocks: ArgMin and ArgMax以下是黑线鳕的链接: ArgMinArgMax

What is the purpose of these two type synonyms?这两种类型的同义词的目的是什么? Where can they be used effectively?在哪里可以有效地使用它们?

It might be helpful to include an explanation of what the argmin and argmax functions do in mathematics, and how that is related to these type synonyms.解释 argmin 和 argmax 函数在数学中的作用以及它们与这些类型同义词的关系可能会有所帮助。


Here's a little extra information so you don't have to jump to Hackage.这里有一些额外的信息,因此您不必跳到 Hackage。

Here's the definition of Arg :这是Arg的定义:

-- | 'Arg' isn't itself a 'Semigroup' in its own right, but it can be
-- placed inside 'Min' and 'Max' to compute an arg min or arg max.
data Arg a b = Arg a b

Its doc string suggests that ArgMin and ArgMax can be placed inside of Min and Max to compute an arg min or an arg max.它的文档字符串表明可以将ArgMinArgMax放在MinMax以计算 arg min 或 arg max。

Min and Max look like the following: MinMax如下所示:

newtype Min a = Min { getMin :: a }

The Semigroup instance is interesting: Semigroup实例很有趣:

instance Ord a => Semigroup (Min a) where
  (<>) = coerce (min :: a -> a -> a)

It looks like it is using min as (<>) .看起来它正在使用min as (<>)

We can look at what the Ord instance looks like for Arg , since it is relevant here:我们可以看看Ord实例对于Arg样子,因为它在这里是相关的:

instance Ord a => Ord (Arg a b) where
  Arg a _ `compare` Arg b _ = compare a b
  min x@(Arg a _) y@(Arg b _)
    | a <= b    = x
    | otherwise = y
  max x@(Arg a _) y@(Arg b _)
    | a >= b    = x
    | otherwise = y

This appears to only run the comparison on the first type argument to Arg .这似乎只对Arg的第一个类型参数运行比较。

I suppose it's one of those things that exist in Haskell because the theoretical concept exists.我想它是 Haskell 中存在的事物之一,因为存在理论概念。 I'm not sure if these types have much practical use, but they do illustrate just how extensive the concepts of semigroups and monoids are in relation to programming.我不确定这些类型是否有很多实际用途,但它们确实说明了半群和幺半群的概念在编程方面的广泛性。

Imagine, for example, that you need to pick the longest of two names, name1 and name2 , both of them String values.想象一下,例如,您需要选择两个名称中最长的name1name2 ,它们都是String值。 You can use the Semigroup instance of ArgMax for that:您可以使用Semigroup的实例ArgMax为:

Prelude Data.Semigroup> Max (Arg (length name1) name1) <> Max (Arg (length name2) name2)
Max {getMax = Arg 5 "Alice"}

After that, it's just a question of unwrapping "Alice" from its container.在那之后,这只是从容器中解开"Alice"的问题。

As Willem Van Onsem points out in the comments, you can use ArgMax and ArgMin to pick the maximum or minimum item, according to some attribute of the item, but still keeping the original item around.正如 Willem Van Onsem 在评论中指出的那样,您可以使用ArgMaxArgMin根据项目的某些属性选择最大或最小项目,但仍保留原始项目。

The purpose of them is to implement things like minimumOn :它们的目的是实现minimumOn类的minimumOn

minimumOn :: (Ord b, Foldable f) => (a -> b) -> f a -> Maybe a
minimumOn f = fmap (getArg  . getMin)
            . getOption
            . foldMap (Option . Just . Min . (Arg =<< f))
            --                         ^^^^^^^^^^
            --                           ArgMin
  where
    getArg (Arg _ x) = x

While this implementation might look a little convoluted, it's often helpful to implement things using general concepts like monoids.虽然这个实现可能看起来有点复杂,但使用像幺半群这样的一般概念来实现东西通常是有帮助的。 For instance, in this case, it is straightforward to adapt the above code to compute the min and max in a single pass.例如,在这种情况下,可以直接修改上述代码以在单次传递中计算最小值和最大值。

I reach for ArgMin / ArgMax when:我在以下情况下使用ArgMin / ArgMax

  • I want to compute (a function of) the minimum/maximum of some values according to a comparison function我想根据比较函数计算某些值的最小值/最大值(的函数)

  • The comparison is costly or unwieldy to recompute, so I want to cache its result;重新计算比较昂贵笨拙,所以我想缓存它的结果; and/or和/或

  • I want to do it monoidally with foldMap instead of with an explicit/specialised minimumBy / maximumBy or sortOn , to leave it flexible to changes in the future such as a different monoid or parallelisation我想用foldMap而不是显式/专门的minimumBy / maximumBysortOn幺半群,让它灵活地适应未来的变化,比如不同的幺半群或并行化

Here's an adaptation of a recent real-world example from my job, findNextWorkerQueue , which takes a map from workers to tasks and finds the worker with the earliest first task, eg given this input:这是我的工作findNextWorkerQueue最近一个真实世界示例的改编版,它获取从工人到任务的映射,并找到最早执行第一个任务的工人,例如,给定以下输入:

  • Worker 1:工人 1:

    • Time 10: Task A时间 10:任务 A
    • Time 12: Task B时间 12:任务 B
    • Time 14: Task C时间 14:任务 C
  • Worker 2:工人 2:

    • Time 5: Task D时间 5:任务 D
    • Time 10: Task E时间 10:任务 E
    • Time 15: Task F时间 15:任务 F
  • Worker 3:工人 3:

    • Time 22: Task G时间 22:任务 G
    • Time 44: Task H时间 44:任务 H

It would produce a start time of 5, and a work queue describing worker 2, with a first task of D, and subsequent tasks of E & F.它将产生 5 的开始时间,以及描述工人 2 的工作队列,第一个任务是 D,随后的任务是 E 和 F。

{-# LANGUAGE ScopedTypeVariables #-}

import Data.Map       (Map)
import Data.Semigroup (Arg(..), Min(..), Option(..))
import Data.Sequence  (Seq(Empty, (:<|)))

import qualified Data.Map as Map

-- An enumeration of computation units for running tasks.
data WorkerId = …

-- The timestamp at which a task runs.
type Time = Int

-- Some kind of task scheduled at a timestamp.
data Scheduled task = Scheduled
  { schedAt   :: !Time
  , schedItem :: !task
  }

-- A non-empty sequence of work assigned to a worker.
data WorkQueue task = WorkQueue
  { wqId    :: !WorkerId
  , wqFirst :: !(Scheduled task)
  , wqRest  :: !(Seq (Scheduled task))
  }

-- | Find the lowest worker ID with the first scheduled task,
-- if any, and return its scheduled time and work queue.
findNextWorkerQueue
  :: forall task
  .  Map WorkerId (Seq (Scheduled task))
  -> Maybe (Time, WorkerQueue task)
findNextWorkerQueue
  = fmap getTimeAndQueue . getOption
  . foldMap (uncurry minWorkerTask) . Map.assocs
  where

    minWorkerTask
      :: WorkerId
      -> Seq (Scheduled task)
      -> Option (Min (Arg (Time, WorkerId) (WorkQueue task)))
    minWorkerTask wid tasks = Option $ case tasks of
      Empty -> Nothing
      t :<| ts -> Just $ Min $ Arg
        (schedTime t, wid)
        WorkQueue { wqId = wid, wqFirst = t, wqRest = ts }

    getTimeAndQueue
      :: Min (Arg (Time, WorkerId) (WorkQueue task))
      -> (Time, WorkQueue task)
    getTimeAndQueue (Min (Arg (time, _) queue))
      = (time, queue)

(Note that this is using Option to support GHC 8.6; in GHC ≥8.8, Maybe has an improved Monoid instance depending on Semigroup instead of Monoid , so we can use it with Min without imposing a Bounded constraint. The time signatures are just for clarity here.) (注意,这是使用Option支持GHC 8.6;在GHC≥8.8, Maybe有一个改进的Monoid取决于例如Semigroup ,而不是Monoid ,所以我们可以用它Min而不强加Bounded ,约束时间签名只是为了清楚起见,这里。)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM