Haskell code littered with TVar operations and functions taking many arguments: code smell?

I'm writing a MUD server in Haskell (MUD = Multi User Dungeon: basically, a multi-user text adventure/role-playing game). The game world data/state is represented in about 15 different IntMaps. My monad transformer stack looks like this: ReaderT MudData IO, where the MudData type is a record type containing the IntMaps, each in its own TVar (I'm using STM for concurrency):

data MudData = MudData { _armorTblTVar    :: TVar (IntMap Armor)
                       , _clothingTblTVar :: TVar (IntMap Clothing)
                       , _coinsTblTVar    :: TVar (IntMap Coins)

...and so on. (I'm using lenses, thus the underscores.)

Some functions need certain IntMaps, while other functions need others. Thus, having each IntMap in its own TVar provides granularity.

However, a pattern has emerged in my code. In the functions that handle player commands, I need to read from (and sometimes later write to) my TVars within the STM monad. Thus these functions end up having an STM helper defined in their where blocks. These STM helpers often have quite a few readTVar operations in them, as most commands need to access a handful of the IntMaps. Furthermore, a function for a given command may call out to a number of pure helper functions that also need some or all of the IntMaps. These pure helper functions thus sometimes end up taking a lot of arguments (sometimes over 10).

So, my code has become "littered" with lots of readTVar expressions and functions that take a large number of arguments. Here are my questions: is this a code smell? Am I missing some abstraction that would make my code more elegant? Is there a more ideal way to structure my data/code?

Thanks!

The solution to this problem is to change the pure helper functions. We don't really want them to be pure; we want them to leak a single side effect: whether or not they read specific pieces of data.

Let's say we have a pure function that uses only clothing and coins:

moreVanityThanWealth :: IntMap Clothing -> IntMap Coins -> Bool
moreVanityThanWealth clothing coins = ...

It's usually nice to know that a function only cares about e.g. clothing and coins, but in your case this knowledge is irrelevant and is just creating headaches. We are going to deliberately forget this detail. If we followed mb14's suggestion, we would pass an entire pure MudData' like the following to the helper functions.

data MudData' = MudData' { _armorTbl    :: IntMap Armor
                         , _clothingTbl :: IntMap Clothing
                         , _coinsTbl    :: IntMap Coins

moreVanityThanWealth :: MudData' -> Bool
moreVanityThanWealth md =
    let clothing = _clothingTbl md
        coins    = _coinsTbl    md
    in  ...

MudData and MudData' are almost identical to each other. One of them wraps its fields in TVars and the other one doesn't. We can modify MudData so that it takes an extra type parameter (of kind * -> *) for what to wrap the fields in. MudData will have the slightly unusual kind (* -> *) -> *, which is closely related to lenses but doesn't have much library support. I call this pattern a Model.

data MudData f = MudData { _armorTbl    :: f (IntMap Armor)
                         , _clothingTbl :: f (IntMap Clothing)
                         , _coinsTbl    :: f (IntMap Coins)

We can recover the original MudData with MudData TVar. We can recreate the pure version by wrapping the fields in Identity, defined as newtype Identity a = Identity {runIdentity :: a}. In terms of MudData Identity, our function would be written as

moreVanityThanWealth :: MudData Identity -> Bool
moreVanityThanWealth md =
    let clothing = runIdentity . _clothingTbl $ md
        coins    = runIdentity . _coinsTbl    $ md
    in  ...

We've successfully forgotten which parts of the MudData we've used, but now we don't have the lock granularity we want. We need to recover, as a side effect, exactly what we just forgot. If we wrote the STM version of the helper it would look like

moreVanityThanWealth :: MudData TVar -> STM Bool
moreVanityThanWealth md =
    do
        clothing <- readTVar . _clothingTbl $ md
        coins    <- readTVar . _coinsTbl    $ md
        return ...

This STM version for MudData TVar is almost exactly the same as the pure version we just wrote for MudData Identity. They only differ by the type of the reference (TVar vs. Identity), what function we use to get the values out of the references (readTVar vs. runIdentity), and how the result is returned (in STM or as a plain value). It would be nice if the same function could be used to provide both. We are going to extract what is common between the two functions. To do so, we'll introduce a type class MonadReadRef r m for the Monads we can read some type of reference from. r is the type of the reference, readRef is the function to get the values out of the references, and m is how the result is returned. The following MonadReadRef is closely related to the MonadRef class from ref-fd.

{-# LANGUAGE FunctionalDependencies #-}

class Monad m => MonadReadRef r m | m -> r where
    readRef :: r a -> m a

As long as code is parameterized over every MonadReadRef r m, it is pure. We can see this by running it with the following instance of MonadReadRef for ordinary values held in an Identity. The id in readRef = id is the same as return . runIdentity.

instance MonadReadRef Identity Identity where
    readRef = id

We'll rewrite moreVanityThanWealth in terms of MonadReadRef.

moreVanityThanWealth :: MonadReadRef r m => MudData r -> m Bool
moreVanityThanWealth md =
    do
        clothing <- readRef . _clothingTbl $ md
        coins    <- readRef . _coinsTbl    $ md
        return ...

When we add a MonadReadRef instance for TVars in STM, we can use these "pure" computations in STM but leak the side effect of which TVars were read.

instance MonadReadRef TVar STM where
    readRef = readTVar
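
To see the pieces working together, here is a minimal self-contained sketch that runs the same helper both purely and inside STM. The Clothing and Coins definitions, the body of moreVanityThanWealth, and the sample tables are placeholders invented for illustration; the post does not show the real types or logic. It assumes the stm and containers packages that ship with GHC.

```haskell
{-# LANGUAGE FunctionalDependencies #-}

import Control.Concurrent.STM (STM, TVar, atomically, newTVarIO, readTVar)
import Data.Functor.Identity (Identity (..))
import Data.IntMap (IntMap)
import qualified Data.IntMap as IM

-- Placeholder item types (assumptions; the real game types are not shown).
newtype Clothing = Clothing Int -- hypothetical: a vanity score
newtype Coins    = Coins Int    -- hypothetical: a coin count

-- The record parameterized over the wrapper f (only two fields here).
data MudData f = MudData
  { _clothingTbl :: f (IntMap Clothing)
  , _coinsTbl    :: f (IntMap Coins)
  }

class Monad m => MonadReadRef r m | m -> r where
  readRef :: r a -> m a

instance MonadReadRef Identity Identity where
  readRef = id

instance MonadReadRef TVar STM where
  readRef = readTVar

-- Hypothetical body: does total vanity exceed total wealth?
moreVanityThanWealth :: MonadReadRef r m => MudData r -> m Bool
moreVanityThanWealth md = do
  clothing <- readRef (_clothingTbl md)
  coins    <- readRef (_coinsTbl md)
  let vanity = sum [v | Clothing v <- IM.elems clothing]
      wealth = sum [c | Coins c <- IM.elems coins]
  return (vanity > wealth)

sampleClothing :: IntMap Clothing
sampleClothing = IM.fromList [(1, Clothing 5), (2, Clothing 7)]

sampleCoins :: IntMap Coins
sampleCoins = IM.fromList [(1, Coins 10)]

-- Pure use: the fields are plain values wrapped in Identity.
pureResult :: Bool
pureResult =
  runIdentity
    (moreVanityThanWealth (MudData (Identity sampleClothing) (Identity sampleCoins)))

-- STM use: only the TVars the helper actually reads join the transaction.
stmResult :: IO Bool
stmResult = do
  md <- MudData <$> newTVarIO sampleClothing <*> newTVarIO sampleCoins
  atomically (moreVanityThanWealth md)
```

The helper takes a single argument regardless of how many tables it consults, and the functional dependency m -> r lets the call site pick the reference type from the monad it runs in.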

Yes, this obviously makes your code complex and clutters the important code with a lot of boilerplate details. And functions with more than 4 arguments are a sign of problems.

I'd ask the question: Do you really gain anything by having separate TVars? Isn't it a case of premature optimization? Before taking such a design decision as splitting your data structure among multiple separate TVars, I'd definitely do some measurements (see criterion). You can create a sample test that models the expected number of concurrent threads and frequency of data updates and check what you are really gaining or losing by having multiple TVars vs. a single one vs. an IORef.

Keep in mind:

  • If there are multiple threads competing for common locks in an STM transaction, the transactions can get restarted several times before they manage to successfully complete. So under some circumstances, having multiple locks can actually make things worse.
  • If there is ultimately just one data structure that you need to synchronize, you might consider using a single IORef instead. Its atomic operations are very fast, which could compensate for having a single central lock.
  • In Haskell it's surprisingly difficult for a pure function to block an atomic STM or an IORef transaction for a long time. The reason is laziness: You only need to create thunks within such a transaction, not to evaluate them. This is true in particular for a single atomic IORef. The thunks are evaluated outside such transactions (by a thread that inspects them, or you can decide to force them at some point, if you need more control; this may be desirable in your case, because if your system evolves without anybody observing it, unevaluated thunks can easily accumulate).
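
The laziness point can be illustrated with a small sketch using only Data.IORef and Control.Exception from base. The names expensiveUpdate, bumpState and forceState are hypothetical, and an Int stands in for the real game state.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, atomicModifyIORef, readIORef)

-- Hypothetical stand-in for an expensive pure update of the game state.
expensiveUpdate :: Int -> Int
expensiveUpdate n = n + sum [1 .. 100000]

-- atomicModifyIORef is lazy: the atomic step only installs a thunk for
-- the new value, so the swap stays short however costly the update is.
bumpState :: IORef Int -> IO ()
bumpState ref = atomicModifyIORef ref (\old -> (expensiveUpdate old, ()))

-- Force the current state to weak head normal form outside the atomic
-- step, e.g. periodically, so that unevaluated thunks do not pile up.
forceState :: IORef Int -> IO Int
forceState ref = readIORef ref >>= evaluate
```

Note that evaluate only forces to weak head normal form; for a nested structure such as a record of IntMaps, a deeper forcing strategy would be needed to avoid retained thunks.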

If it turns out that having multiple TVars is indeed crucial, then I'd probably write all the code in a custom monad (as described by @Cirdec while I was writing my answer), whose implementation would be hidden from the main code, and which would provide functions for reading (and perhaps also writing) parts of the state. It'd then be run as a single STM transaction, reading and writing only what's needed, and you could have a pure version of the monad for testing.
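
One possible shape for such a monad, as a rough sketch rather than a definitive design: command handlers are written against a small type class, with one STM-backed implementation and one pure implementation for tests. Everything named here (MonadMud, World, MudSTM, MudPure, giveCoins, and the use of Int for coins) is hypothetical, and only a single table is modelled. It assumes the stm, containers and transformers packages that ship with GHC.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Concurrent.STM (STM, TVar, atomically, modifyTVar', readTVar)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)
import Control.Monad.Trans.State.Strict (State, get, modify', runState)
import Data.IntMap (IntMap)
import qualified Data.IntMap as IM

type Coins = Int -- placeholder for the real game type

-- The only interface command handlers see (hypothetical operations).
class Monad m => MonadMud m where
  getCoinsTbl    :: m (IntMap Coins)
  modifyCoinsTbl :: (IntMap Coins -> IntMap Coins) -> m ()

-- STM-backed implementation: one TVar per table, as in the question.
newtype World = World { coinsVar :: TVar (IntMap Coins) }

newtype MudSTM a = MudSTM (ReaderT World STM a)
  deriving (Functor, Applicative, Monad)

instance MonadMud MudSTM where
  getCoinsTbl      = MudSTM (asks coinsVar >>= lift . readTVar)
  modifyCoinsTbl f = MudSTM (asks coinsVar >>= \v -> lift (modifyTVar' v f))

-- Run a whole command as a single STM transaction.
runMudSTM :: World -> MudSTM a -> IO a
runMudSTM w (MudSTM m) = atomically (runReaderT m w)

-- Pure implementation for tests: the state is just the table itself.
newtype MudPure a = MudPure (State (IntMap Coins) a)
  deriving (Functor, Applicative, Monad)

instance MonadMud MudPure where
  getCoinsTbl      = MudPure get
  modifyCoinsTbl f = MudPure (modify' f)

runMudPure :: IntMap Coins -> MudPure a -> (a, IntMap Coins)
runMudPure tbl (MudPure m) = runState m tbl

-- A command handler written once against the interface: give `amount`
-- coins to entity `i` and report its new balance.
giveCoins :: MonadMud m => Int -> Coins -> m Coins
giveCoins i amount = do
  modifyCoinsTbl (IM.insertWith (+) i amount)
  IM.findWithDefault 0 i <$> getCoinsTbl
```

Because handlers such as giveCoins never mention TVar or STM, the decision between many TVars, a single TVar, or an IORef stays confined to the instance definitions.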
