
When manipulating immutable data structures, what's the difference between Clojure's assoc-in and Haskell's lenses?

I need to manipulate and modify deeply nested immutable collections (maps and lists), and I'd like to better understand the different approaches. These two libraries solve more or less the same problem, right? How are they different, and for what kinds of problems is one approach more suitable than the other?

Clojure's assoc-in
Haskell's lens

Clojure's assoc-in lets you specify a path through a nested data structure using integers and keywords and introduce a new value at that path. It has partners dissoc-in, get-in, and update-in, which respectively remove elements, get them without removal, or modify them. (Note that dissoc-in is not part of clojure.core; it comes from contrib libraries.)

Lenses are a particular notion of bidirectional programming where you specify a linkage between two data sources and that linkage lets you reflect transformations from one to the other. In Haskell this means that you can build lenses or lens-like values which connect a whole data structure to some of its parts and then use them to transmit changes from the parts to the whole.

There's an analogy here. If we look at a use of assoc-in it's written like

(assoc-in whole path subpart)

and we might gain some insight by thinking of the path as a lens and assoc-in as a lens combinator. In a similar way you might write (using the Haskell lens package)

set lens subpart whole

so that we connect assoc-in with set and path with lens . We can also complete the table

Haskell lens   Clojure
set            assoc-in
view           get-in
over           update-in
(unneeded)     dissoc-in     -- this is special because `at` and `over`
                             -- strictly generalize dissoc-in
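To make the correspondence in the table concrete, here is a minimal sketch that hand-rolls a lens type and the three combinators using only base. The type has the same shape as the lens package's Lens', but the definitions and the names first' and second' are illustrative assumptions, not the library's code.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A hand-rolled van Laarhoven lens (same shape as Lens' in the lens package).
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- view ~ get-in: read the focused part.
view :: Lens' s a -> s -> a
view l = getConst . l Const

-- over ~ update-in: apply a function to the focused part.
over :: Lens' s a -> (a -> a) -> s -> s
over l f = runIdentity . l (Identity . f)

-- set ~ assoc-in: replace the focused part.
set :: Lens' s a -> a -> s -> s
set l x = over l (const x)

-- Lenses onto the two halves of a pair (hypothetical names).
first' :: Lens' (a, c) a
first' f (a, c) = (\a' -> (a', c)) <$> f a

second' :: Lens' (c, b) b
second' f (c, b) = (\b' -> (c, b')) <$> f b

main :: IO ()
main = do
  let whole = ((1, "x"), True) :: ((Int, String), Bool)
      -- The "path" is just function composition of smaller lenses.
      path :: Lens' ((Int, String), Bool) String
      path = first' . second'
  putStrLn (view path whole)        -- x
  print (set path "y" whole)        -- ((1,"y"),True)
  print (over path (++ "!") whole)  -- ((1,"x!"),True)
```

The path here plays the role of the vector passed to assoc-in, except that it is an ordinary function built with (.) rather than a list of keys.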

That's a start for similarities, but there's a huge dissimilarity, too. In many ways, lens is far more generic than the *-in family of Clojure functions is. Typically this is a non-issue for Clojure because most Clojure data is stored in nested structures made of lists and dictionaries. Haskell uses many more custom types very freely and its type system reflects information about them. Lenses generalize the *-in family of functions because they work smoothly over that far more complex domain.

First, let's embed Clojure types in Haskell and write the *-in family of functions.

type Dict a = Map String a

data Clj 
  = CljVal             -- Dynamically typed Clojure value, 
                       -- not an array or dictionary
  | CljAry  [Clj]      -- Array of Clojure types
  | CljDict (Dict Clj) -- Dictionary of Clojure types

makePrisms ''Clj

Now we can use set as assoc-in almost directly.

(assoc-in whole [1 :foo :bar 3] part)

set ( _CljAry  . ix 1 
    . _CljDict . ix "foo" 
    . _CljDict . ix "bar" 
    . _CljAry  . ix 3
    ) part whole

This obviously has a lot more syntactic noise, but it is also more explicit about what the "path" into a datatype means: in particular, it states whether we're descending into an array or a dictionary at each step. We could, if we wanted, eliminate some of that extra noise by making Clj an instance of the Haskell typeclass Ixed, but it's hardly worth it at this point.

Instead, the point to be made is that assoc-in is applying to a very particular kind of data descent. It's more general than the types I laid out above due to Clojure's dynamic typing and overloading of IFn , but a very similar fixed structure like that could be embedded in Haskell with little further effort.

Lenses can go much further though, and do so with greater type safety. For instance, the example above is actually not a true "Lens" but instead a "Prism" or "Traversal" which allows the type system to statically identify the possibility of failing to make that traversal. It will force us to think about error conditions like that (even if we choose to ignore them).

Importantly that means that we can be sure when we have a true lens that datatype descent cannot fail—that kind of guarantee is impossible to make in Clojure.

We can define custom data types and make custom lenses which descend into them in a typesafe fashion.

data Point = 
  Point { _latitude  :: Double
        , _longitude :: Double
        , _meta      :: Map String String }
  deriving Show

makeLenses ''Point

> let p0 = Point 0 0 mempty
> let p1 = set latitude 3 p0
> view latitude p1
3.0
> view longitude p1
0.0
> let p2 = set (meta . at "foo") (Just "bar") p1  -- at, not ix: ix cannot insert a missing key
> preview (meta . ix "bar") p2
Nothing
> preview (meta . ix "foo") p2 
Just "bar"

We can also generalize to Lenses (really Traversals) which target multiple similar subparts all at once

dimensions :: Traversal' Point Double

> let p3 = over dimensions (+ 10) p0
> view latitude p3
10.0
> view longitude p3
10.0
> toListOf dimensions p3
[10.0,10.0]
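The signature above is given without a body. One plausible way to write such a traversal by hand is sketched below using only base and containers; the Traversal' synonym mirrors the lens package's type, while over and toListOf are simplified stand-ins written for this example rather than the library's definitions.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))
import qualified Data.Map as Map

data Point = Point
  { _latitude  :: Double
  , _longitude :: Double
  , _meta      :: Map.Map String String
  } deriving (Show, Eq)

-- Same shape as Traversal' in the lens package.
type Traversal' s a = forall f. Applicative f => (a -> f a) -> s -> f s

-- Visits both coordinates in order and leaves the metadata untouched.
dimensions :: Traversal' Point Double
dimensions f (Point lat lon m) = Point <$> f lat <*> f lon <*> pure m

-- Simplified stand-in: modify every target.
over :: Traversal' s a -> (a -> a) -> s -> s
over t f = runIdentity . t (Identity . f)

-- Simplified stand-in: collect every target into a list.
toListOf :: Traversal' s a -> s -> [a]
toListOf t = getConst . t (\a -> Const [a])

main :: IO ()
main = do
  let p0 = Point 0 0 Map.empty
      p3 = over dimensions (+ 10) p0
  print (toListOf dimensions p3)  -- [10.0,10.0]
  print (_latitude p3)            -- 10.0
```

The only change from a single-target lens is the Applicative constraint, which is what lets one optic combine the results of visiting several subparts.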

Or even target simulated subparts which don't actually exist but still form an equivalent description of our data

eulerAnglePhi   :: Lens' Point Double
eulerAngleTheta :: Lens' Point Double
eulerAnglePsi   :: Lens' Point Double
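As a simpler illustration of a "simulated" subpart than Euler angles, the sketch below focuses on a coordinate pair that is never stored as a tuple inside Point. The lens coords is a hypothetical name invented for this example, and the lens type and combinators are hand-rolled with base only.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))
import qualified Data.Map as Map

data Point = Point
  { _latitude  :: Double
  , _longitude :: Double
  , _meta      :: Map.Map String String
  } deriving (Show, Eq)

type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- A virtual field: the pair is assembled on read and taken apart on write.
coords :: Lens' Point (Double, Double)
coords f (Point lat lon m) = rebuild <$> f (lat, lon)
  where
    rebuild (lat', lon') = Point lat' lon' m

view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

main :: IO ()
main = do
  let p = set coords (3, 4) (Point 0 0 Map.empty)
  print (view coords p)  -- (3.0,4.0)
  print (_longitude p)   -- 4.0
```

Callers cannot tell whether the focused value is a real field or a computed view, which is the property the Euler-angle signatures rely on.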

Broadly, Lenses generalize the kind of path-based interaction between whole values and subparts of values that the Clojure *-in family of functions abstract. You can do a lot more in Haskell because Haskell has a much more developed notion of types and Lenses, as first class objects, widely generalize the notions of getting and setting that are simply presented with the *-in functions.

You're talking about two very different things.

You can use lens to solve problems similar to those assoc-in solves, when you're using collection types ( Data.Map , Data.Vector ) that match its semantics, but there are differences.

In untyped languages like Clojure it's common to structure your domain data in terms of collections that have non-static contents (hash-maps, vectors, etc) even when it's modeling data that is conventionally static.

In Haskell you would structure your data using records and ADTs. You can still express contents that might or might not exist (or wrap a collection), but the default is statically known contents.

One library to look at would be http://hackage.haskell.org/package/lens-aeson where you have JSON documents which have possibly varying contents.

The examples demonstrate that when your path and type don't match the structure/data, it kicks out a Nothing instead of Just a .

Lens doesn't do anything beyond provide sound getter/setter behavior. It doesn't express a particular expectation about how your data looks, whereas assoc-in only makes sense with associative collections with possibly non-deterministic contents.

Another difference here is purity and laziness vs. strict and impure semantics. In Haskell, if you never used the "older" states, and only the most recent one, then only that value will be realized.

tl;dr lenses as found in Lens and other similar libraries are more general, more useful, type-safe, and especially nice in lazy/pure FP languages.

assoc-in can be more versatile than lens in some cases, because it can create levels in the structure if they don't exist.

lens offers Folds , that tear down the structure and return a summary of the contained values, and Traversals that modify elements in the structure (possibly targeting several elements at once, possibly doing nothing if the targeted element(s) are not present) while maintaining the structure's overall "shape". But I think it would be difficult to create intermediate levels using lens .
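For what it's worth, intermediate levels can be created if the path is built from lenses that focus on a Maybe value. The sketch below hand-rolls two such lenses with base and containers; atKey and orEmpty are hypothetical names, modelled on what the lens package offers as at and non, and are not the library's definitions.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Functor.Identity (Identity (..))
import qualified Data.Map as Map
import Data.Maybe (fromMaybe)

type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Focus on the (possibly missing) value under a key.
-- Writing Just inserts, writing Nothing deletes.
atKey :: Ord k => k -> Lens' (Map.Map k v) (Maybe v)
atKey k f m = Map.alterF f k m

-- Read a missing inner map as empty; write an empty inner map back as missing.
orEmpty :: Lens' (Maybe (Map.Map k v)) (Map.Map k v)
orEmpty f mm = shrink <$> f (fromMaybe Map.empty mm)
  where
    shrink m = if Map.null m then Nothing else Just m

set :: Lens' s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

type Nested = Map.Map String (Map.Map String Int)

main :: IO ()
main = do
  let path :: Lens' Nested (Maybe Int)
      path = atKey "a" . orEmpty . atKey "b"
  -- The "a" level does not exist yet and is created on the way down.
  print (set path (Just 1) Map.empty)
  -- fromList [("a",fromList [("b",1)])]
```

This is roughly what (assoc-in {} ["a" "b"] 1) does in Clojure, with the difference that each level's "default when missing" is spelled out in the path.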

Another difference I see with the assoc-in -like functions in Clojure is that these seem to be only concerned with getting and setting values, while the very definition of a lens supports "doing something with the value", that something possibly involving side-effects.

For example, suppose we have a tuple (1,Right "ab") . The second component is a sum type that can contain a string. We want to change the first character of the string by reading it from console. This can be done with lenses as follows:

(_2._Right._Cons._1) (\_ -> getChar) (1,Right "ab")
-- reads char from console and returns the updated structure

If the string is not present, or is empty, nothing is done:

(_2._Right._Cons._1) (\_ -> getChar) (1,Left 5)
-- nothing read

(_2._Right._Cons._1) (\_ -> getChar) (1,Right "")
-- nothing read
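The same behaviour can be sketched without the lens package by writing the three path segments by hand. The names second', right', firstElem and shout are assumptions made for this example, and shout logs to the console instead of reading a character so that the run is repeatable.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Char (toUpper)

type Traversal' s a = forall f. Applicative f => (a -> f a) -> s -> f s

-- Second component of a pair.
second' :: Traversal' (c, a) a
second' f (c, a) = (\a' -> (c, a')) <$> f a

-- The payload of a Right, if there is one.
right' :: Traversal' (Either e a) a
right' f (Right a) = Right <$> f a
right' _ (Left e)  = pure (Left e)

-- The first element of a list, if there is one.
firstElem :: Traversal' [a] a
firstElem f (x : xs) = (: xs) <$> f x
firstElem _ []       = pure []

-- An effectful update: log the old character, then upper-case it.
shout :: Char -> IO Char
shout c = do
  putStrLn ("replacing " ++ show c)
  pure (toUpper c)

main :: IO ()
main = do
  let path :: Traversal' (Int, Either Int String) Char
      path = second' . right' . firstElem
  path shout (1, Right "ab") >>= print  -- logs once, then prints (1,Right "Ab")
  path shout (1, Left 5)     >>= print  -- no log; prints (1,Left 5)
  path shout (1, Right "")   >>= print  -- no log; prints (1,Right "")
```

Because the traversal is just a function, applying it directly to an effectful callback runs the effect once per target and zero times when the target is absent.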

This question is somewhat analogous to asking what's the difference between Clojure's for and Haskell's monads. I'll mimic the answers so far: sure for is sort of like a List monad, but monads are so much more generic and powerful.

But, this is somewhat silly, right? Monads have been implemented in Clojure. Why aren't they used all the time? Clojure has at its core a different philosophy about how to handle state, but still feels free to borrow good ideas from great languages like Haskell in its libraries.

So, sure, assoc-in , get-in , update-in , etc. are sort of like lenses for associative data structures. And there are implementations of lenses in general in Clojure out there. Why aren't they used all the time? It is a difference in philosophy (and perhaps the eerie feeling that with all the setters and getters we'd be making another Java inside Clojure and somehow end up marrying our mother). But, Clojure feels free to borrow good ideas, and you can see lens-inspired approaches making their way into cool projects like Om and Enliven.

You have to be careful asking such questions, because like half-siblings who occupy some of the same space Clojure and Haskell are bound to be borrowing from each other and squabbling a bit about who is right.
