Are codatatypes really terminal algebras?

(Disclaimer: I'm not 100% sure how codatatypes work, especially when not referring to terminal algebras.)

Consider the "category of types", something like Hask but with whatever adjustments fit the discussion. Within such a category, it is said that (1) initial algebras define datatypes, and (2) terminal algebras define codatatypes.

I'm struggling to convince myself of (2).

Consider the functor T(t) = 1 + a * t. I agree that the initial T-algebra is well-defined and indeed defines [a], the list of a. By definition, the initial T-algebra is a type X together with a function f :: 1+a*X -> X, such that for any other type Y and function g :: 1+a*Y -> Y, there is exactly one function m :: X -> Y such that m . f = g . T(m) (where . denotes the function composition operator as in Haskell). With f interpreted as the list constructor(s), g as the initial value and the step function, and T(m) as the recursion operation, the equation essentially asserts the unique existence of the function m given any initial value and any step function defined in g, which necessitates an underlying well-behaved fold together with the underlying type, the list of a.

For example, g :: Unit + (a, Nat) -> Nat could be () -> 0 | (_,n) -> n+1, in which case m defines the length function, or g could be () -> 0 | (_,n) -> 0, in which case m defines a constant zero function. An important fact here is that, for whatever g, m can always be uniquely defined, just as fold does not impose any constraint on its arguments and always produces a unique well-defined result.
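To make this concrete, here is a minimal Haskell sketch of the initial-algebra reading above. The names ListF, inList, cata, lengthAlg and zeroAlg are illustrative assumptions rather than library functions, and Int stands in for Nat.

```haskell
-- Base functor T(t) = 1 + a * t, written as a Haskell data type.
data ListF a t = NilF | ConsF a t

instance Functor (ListF a) where
  fmap _ NilF        = NilF
  fmap h (ConsF a t) = ConsF a (h t)

-- The initial algebra f :: T [a] -> [a] packs the list constructors.
inList :: ListF a [a] -> [a]
inList NilF           = []
inList (ConsF a rest) = a : rest

-- The unique morphism m :: [a] -> y induced by an algebra g :: T y -> y.
-- It satisfies m . inList = g . fmap m by construction.
cata :: (ListF a y -> y) -> [a] -> y
cata g []         = g NilF
cata g (a : rest) = g (ConsF a (cata g rest))

-- g = () -> 0 | (_, n) -> n + 1 induces the length function.
lengthAlg :: ListF a Int -> Int
lengthAlg NilF        = 0
lengthAlg (ConsF _ n) = n + 1

-- g = () -> 0 | (_, n) -> 0 induces the constant zero function.
zeroAlg :: ListF a Int -> Int
zeroAlg NilF        = 0
zeroAlg (ConsF _ _) = 0
```

In GHCi, cata lengthAlg "abc" and cata zeroAlg "abc" would be the two functions m described above, applied to a three-element list.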

This does not seem to hold for terminal algebras.

Consider the same functor T defined above. The definition of the terminal T-algebra is the same as the initial one, except that m is now of type Y -> X and the equation now becomes m . g = f . T(m). It is said that this should define a potentially infinite list.

I agree that this is sometimes true. For example, when g :: Unit + (Unit, Int) -> Int is defined as () -> 0 | (_,n) -> n+1 like before, m then behaves such that m(0) = Nil and m(n+1) = Cons () m(n). For non-negative n, m(n) should be a finite list of units. For any negative n, m(n) should be of infinite length. It can be verified that the equation above holds for such g and m.

With either of the following two modified definitions of g, however, I no longer see any well-defined m.

First, when g is again () -> 0 | (_,n) -> n+1 but is of type g :: Unit + (Bool, Int) -> Int, m must satisfy m(g((b,i))) = Cons b m(g(i)), which means that the result depends on b. But this is impossible, because m(g((b,i))) is really just m(i+1), which makes no mention of b whatsoever, so the equation is not well-defined.

Second, when g is again of type g :: Unit + (Unit, Int) -> Int but is defined as the constant zero function g _ = 0, m must satisfy m(g(())) = Nil and m(g(((),i))) = Cons () m(g(i)), which are contradictory because their left-hand sides are the same, both being m(0), while the right-hand sides are never the same.

In summary, there are T-algebras that have no morphism into the supposed terminal T-algebra, which implies that the terminal T-algebra does not exist. The theoretical modeling of the codatatype Stream (or infinite list), if any, cannot be based on the nonexistent terminal algebra of the functor T(t) = 1 + a * t.

Many thanks for any hint about a flaw in the argument above.

(2) terminal algebras define codatatypes.

This is not right: codatatypes are terminal coalgebras. For your T functor, a coalgebra is a type x together with f :: x -> T x. A T-coalgebra morphism between (x1, f1) and (x2, f2) is a g :: x1 -> x2 such that fmap g . f1 = f2 . g. Using this definition, the terminal T-coalgebra defines the possibly infinite lists (so-called "colists"), and the terminality is witnessed by the unfold function:

unfold :: (x -> Unit + (a, x)) -> x -> Colist a
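One way this unfold could be written is sketched below. It assumes Unit + (a, x) is rendered as Either () (a, x); the Colist constructors and the countdown and takeC helpers are hypothetical names added for illustration. Integer is used instead of Int so that stepping down from a negative seed never wraps around to 0.

```haskell
-- Possibly infinite lists; laziness allows the tail to be unbounded.
data Colist a = Nil | Cons a (Colist a)

-- The unique coalgebra morphism from (x, f) into Colist a.
unfold :: (x -> Either () (a, x)) -> x -> Colist a
unfold f x = case f x of
  Left ()       -> Nil
  Right (a, x') -> Cons a (unfold f x')

-- Hypothetical coalgebra: stop at 0, otherwise emit () and step down.
countdown :: Integer -> Either () ((), Integer)
countdown 0 = Left ()
countdown n = Right ((), n - 1)

-- Observe at most k elements, so infinite colists can be inspected.
takeC :: Int -> Colist a -> [a]
takeC k (Cons a rest) | k > 0 = a : takeC (k - 1) rest
takeC _ _ = []
```

Here unfold countdown 3 is a finite colist of three units, while unfold countdown (-1) never reaches 0 and is therefore infinite, which matches the behaviour the question expected of m.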

Note though that a terminal T-algebra does exist: it is simply the Unit type together with the constant function T Unit -> Unit (and this works as a terminal algebra for any T). But this is not very interesting for writing programs.
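As a small sketch of why Unit works: the only function from any type into () is the constant one, so it is automatically the unique algebra morphism. The names ListF, unitAlg, bang, lengthAlg, lhs and rhs below are assumptions made for illustration.

```haskell
-- Base functor T(t) = 1 + a * t.
data ListF a t = NilF | ConsF a t

instance Functor (ListF a) where
  fmap _ NilF        = NilF
  fmap h (ConsF a t) = ConsF a (h t)

-- The algebra T Unit -> Unit: necessarily the constant function.
unitAlg :: ListF a () -> ()
unitAlg _ = ()

-- The only total function y -> (), for any carrier y.
bang :: y -> ()
bang _ = ()

-- A sample algebra g on Int to compare against.
lengthAlg :: ListF a Int -> Int
lengthAlg NilF        = 0
lengthAlg (ConsF _ n) = n + 1

-- Both sides of the morphism equation  bang . g = unitAlg . fmap bang.
lhs, rhs :: ListF a Int -> ()
lhs = bang . lengthAlg
rhs = unitAlg . fmap bang
```

Both lhs and rhs return () on every input, so the equation holds trivially for any choice of g.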

it is said that (1) the initial algebras define datatypes, and (2) terminal algebras define codatatypes.

On the second point, it is actually said that terminal coalgebras define codatatypes.

A datatype t is defined by its constructors and a fold.

  • Constructors can be modelled by an algebra F t -> t (for example, the Peano constructors O: Nat and S: Nat -> Nat are collected as a single function in: Unit + Nat -> Nat).
  • The fold then gives the catamorphism fold f: t -> x for any algebra f: F x -> x (for nats, fold: ((Unit + x) -> x) -> Nat -> x).
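The two points above can be sketched in Haskell as follows, assuming Unit + x is rendered as Either () x. The names inNat, foldNat and toInt are illustrative (in is a reserved word in Haskell, hence inNat).

```haskell
-- Peano naturals, defined by their constructors.
data Nat = O | S Nat

-- Both constructors collected into one algebra  Unit + Nat -> Nat.
inNat :: Either () Nat -> Nat
inNat (Left ()) = O
inNat (Right n) = S n

-- The catamorphism: one function Nat -> x for every algebra on x.
foldNat :: (Either () x -> x) -> Nat -> x
foldNat f O     = f (Left ())
foldNat f (S n) = f (Right (foldNat f n))

-- Example algebra on Int: interpret O as 0 and S as (+ 1).
toInt :: Nat -> Int
toInt = foldNat alg
  where
    alg (Left ()) = 0
    alg (Right k) = k + 1
```

Folding with inNat itself rebuilds the same number, which is the usual sanity check that the fold respects the constructors.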

A codatatype t is defined by its destructors and an unfold.

  • Destructors can be modelled by a coalgebra t -> F t (for example, streams have two destructors head: Stream a -> a and tail: Stream a -> Stream a, and they are collected as a single function out: Stream a -> a * Stream a).
  • The unfold then gives the anamorphism unfold f: x -> t for any coalgebra f: x -> F x (for streams, unfold: (x -> a * x) -> x -> Stream a).
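The stream case can be sketched the same way, assuming a * x is rendered as the pair (a, x). The constructor SCons and the names headS, tailS, unfoldS, nats and takeS are hypothetical, chosen to avoid clashing with Prelude functions.

```haskell
-- Streams are always infinite: there is no empty case.
data Stream a = SCons a (Stream a)

-- The two destructors.
headS :: Stream a -> a
headS (SCons a _) = a

tailS :: Stream a -> Stream a
tailS (SCons _ s) = s

-- Both destructors collected into one coalgebra  Stream a -> a * Stream a.
out :: Stream a -> (a, Stream a)
out s = (headS s, tailS s)

-- The anamorphism: one function x -> Stream a for every coalgebra on x.
unfoldS :: (x -> (a, x)) -> x -> Stream a
unfoldS f x = let (a, x') = f x in SCons a (unfoldS f x')

-- Example coalgebra on Integer: emit the seed, then increment it.
nats :: Stream Integer
nats = unfoldS (\n -> (n, n + 1)) 0

-- Observe a finite prefix of a stream.
takeS :: Int -> Stream a -> [a]
takeS k s
  | k <= 0    = []
  | otherwise = headS s : takeS (k - 1) (tailS s)
```

Because Haskell is lazy, nats is only evaluated as far as takeS demands.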

(Disclaimer: I'm not 100% sure how codatatypes work, especially when not referring to terminal algebras.)

A codata type, or coinductive data type, is just one defined by its eliminations rather than its introductions.

It seems that sometimes terminal algebra is used (very confusingly) to refer to a final coalgebra, which is what actually defines a codata type.

Consider the same functor T defined above. The definition of the terminal T-algebra is the same as the initial one, except that m is now of type Y -> X and the equation now becomes m . g = f . T(m). It is said that this should define a potentially infinite list.

So I think this is where you've gone wrong: “m ∘ g = f ∘ T(m)” should be reversed, and read “T(m) ∘ f = g ∘ m”. That is, the final coalgebra is defined by a carrier set S and a map g : S → T(S) such that for any other coalgebra (R, f : R → T(R)) there is a unique map m : R → S such that T(m) ∘ f = g ∘ m.

m is uniquely defined recursively by the map that returns Left () whenever f maps to Left (), and Right (x, m xs) whenever f maps to Right (x, xs). That is, it is the assignment of the coalgebra to its unique morphism to the final coalgebra, and it denotes the unique anamorphism/unfold of this type, which you should be able to convince yourself is in fact a possibly-empty and possibly-infinite stream.
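The recursive description of m can be transcribed almost literally. In this sketch the carrier S is a hypothetical newtype Colist whose field out plays the role of g, T(m) is fmap (fmap m) on Either () (a, r), and upTo3 and toListN are illustrative helpers.

```haskell
-- Carrier S of the final coalgebra; out is the structure map g : S -> T(S).
newtype Colist a = Colist { out :: Either () (a, Colist a) }

-- The unique morphism m : R -> S for a coalgebra f : R -> T(R).
-- Unfolding the definition gives  out . ana f = fmap (fmap (ana f)) . f,
-- which is the equation  g . m = T(m) . f.
ana :: (r -> Either () (a, r)) -> r -> Colist a
ana f r = Colist $ case f r of
  Left ()       -> Left ()
  Right (x, xs) -> Right (x, ana f xs)

-- Hypothetical coalgebra: count upwards and stop once 3 is reached.
upTo3 :: Integer -> Either () (Integer, Integer)
upTo3 n
  | n >= 3    = Left ()
  | otherwise = Right (n, n + 1)

-- Observe at most k elements of a colist.
toListN :: Int -> Colist a -> [a]
toListN k c = case out c of
  Right (x, rest) | k > 0 -> x : toListN (k - 1) rest
  _                       -> []
```

Replacing upTo3 with a coalgebra that never returns Left () yields an infinite colist, which toListN can still observe up to any finite depth.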
