Promoting the arrow type

Dependent Haskell is becoming reality. With GHC 8.0 we have GADT constructor promotion, which allows for new forms of type constructors, but doing functional programming on types remains awkward because the type-level analgoues of term-level functions–the type families–are not types at all. The first-class functions which make functional programming such a joy simply are not present at the type-level in GHC’s Haskell.

The 8.0 release permits a simple construction for making type families appear to be first-class. The idea is to proxy them with actual data types. For every type-level function, there is a type constructor, and a type family instance defined on that type constructor which actually implements the function. The simplest example is the identity function:

    type Id = F IdProxy
    data IdProxy (x :: k) (p :: Proxy k)
    type instance EvalFunction (IdProxy x) = x

Type family proxies like IdProxy always follow a certain form: their final argument must be a Proxy which indicates the kind of thing it returns when evaluated with EvalFunction. This is plain in the definition of that family:

    type family EvalFunction (d :: Proxy l -> Type) :: l

The type family F, used to define Id, is a kind of type-level smart constructor, which uses a type family proxy to come up with a type in the kind Function, which contains not only these promoted type families, but also the type constructors and promoted data constructors which already behave, more or less, like first-class functions:

    data Function (s :: Type) (t :: Type) where
        Constructor :: (k -> l) -> Function s t
        Family :: (k -> l) -> Function s t

    infixr 0 :->
    type s :-> t = Function s t

There’s another smart constructor C for type and data constructors. It’s no mystery why we need these: in either of the Function constructors there is no relationship between the arrow kinded type k -> l and the parameters s and t.

As function construction is handled by type families F and C, function deconstruction (application) is performed by a type family called At.

    Id `At` 'True ~ 'True

With these ingredients, we can do functional programming with types:

    type Plus = F PlusProxy
    data PlusProxy (n :: Nat) (m :: Nat) (p :: Proxy Nat)
    type instance EvalFunction (PlusProxy n m) = n + m

    type Foldr = F FoldrProxy
    data FoldrProxy (f :: a :-> b :-> b) (r :: b) (xs :: [a]) (p :: Proxy b)
    type instance EvalFunction (FoldrProxy f r '[]) = r
    type instance EvalFunction (FoldrProxy f r (x ': xs)) =
        f `At` x `At` (Foldr `At` f `At` r `At` xs)

    -- Six ~ 6 :: Nat
    type Six = Foldr `At` Plus `At` 0 `At` '[1,2,3]

We can even do ad-hoc polymorphism and give analogues of fmap, pure, <*>, and >>=. Check out the repository for more.


Diplomacy

This is an announcement of the diplomacy and diplomacy-server packages. The former is a library with definitions relevant to the famous board game called Diplomacy, and the latter is an implementation of the game atop that library.

There are plenty of Diplomacy implementations out there. I’ve created this one for at least two reasons:

  1. Diplomacy is a board game with a rule book. I wondered, can’t the rule book simply be transcribed into a programming language and voilà, an implementation? After all, the rule book is supposedly a specification of the game, and so is any correct implementation of the game.
  2. This implementation allows for what I think is a pretty cool style of play: get together with your friends, start the server and create a game, and have each player control their forces from their own private devices (smartphones or laptops).

For the first point, I was dead wrong. Even in an extremely high-level language like Haskell, it’s very hard to transcribe the Diplomacy rule book into a program. I still wonder exactly why this is the case. The fact that Diplomacy in particular has a few ambiguities in the rules does not help, but the DATC deals with these, so that’s not an essential problem. My best explanation for the difficulty–the same difficulty we have with transcribing a product manager’s spec into a working product–is that there’s some gap in the specification which our brains fill in, but which leaves our programs incomplete. It’s straightforward to transcribe all of the rules of a board game into well-typed Haskell, but apparently that’s not enough: we need algorithms to work with these rules which handle all cases, whereas it seems human interpretation can settle for a few examples and infer an algorithm for game play.


Pictures of mountains and lakes 6

Andy Bailey Lake in the evening.

Andy Bailey Lake in the evening.

Andy Bailey Lake in the morning.

Andy Bailey Lake in the morning.

Alberta highway 1A near Canmore.

Alberta highway 1A near Canmore.

Some vista in Canmore.

Some vista in Canmore.


Pictures of mountains and lakes 5

Atlin from the yacht club dock.

Atlin from the yacht club dock.

A boat in Atlin

A boat in Atlin

Another side of Atlin

Another side of Atlin

Palmer Lake in the evening.

Palmer Lake in the evening.

Waterloo and Kitchener at the signpost forest in Watson Lake.

Waterloo and Kitchener at the signpost forest in Watson Lake.

Muncho Lake from the Alaska Highway.

Muncho Lake from the Alaska Highway.

Muncho Lake with more context.

Muncho Lake with more context.

Some clouds from Grande Cache.

Some clouds from Grande Cache.


Pictures of mountains and lakes 4

Looking south from the Carcross Desert

Looking south from the Carcross Desert

Klondike Highway south of Carcross. I think that's Bove Island on the left.

Klondike Highway south of Carcross. I think that’s Bove Island on the left.

The way home from Skagway

The way home from Skagway

The beach at Carcross. Maybe my favourite place in Yukon.

The beach at Carcross. Maybe my favourite place in Yukon.

The view from the highway at Carcross.

The view from the highway at Carcross.

Miles Canyon, near Whitehorse.

Miles Canyon, near Whitehorse.


Pictures of mountains and lakes 3

A boat in Whitehorse

A boat in Whitehorse

The architecture in Dawson City

The architecture in Dawson City

Midnight in Dawson City

Midnight in Dawson City

IMG_0641

Leaving wild Dawson City

Leaving wild Dawson City

Five Finger Rapids

Five Finger Rapids

On the way to Carcross

On the way to Carcross

Emerald Lake from the Klondike highway

Emerald Lake from the Klondike highway


Pictures of mountains and lakes 2

A Stewart hybrid

A Stewart hybrid

Travelling north on 37

Travelling north on 37

My favourite vista on 37, near Dease Lake

My favourite vista on 37, near Dease Lake

Tatogga Lake

Tatogga Lake

Eerie Dease Lake

Eerie Dease Lake

The Yukon River at Whitehorse

The Yukon River at Whitehorse


Pictures of mountains and lakes 1

CRAB park in Vancouver

Some vista along highway 99

Some vista along highway 99

Highway 1 north of Yale, looking south

Highway 1 north of Yale, looking south

A mural in Quesnel

A mural in Quesnel

A harbour in Prince Rupert

A harbour in Prince Rupert

A mural in Prince Rupert

A mural in Prince Rupert

A cow in Prince Rupert, next to the mural and the great Opa Sushi

A cow in Prince Rupert, next to the mural and the great Opa Sushi

Highway 37 begins

Highway 37 begins

Some vista from highway 37A

Some vista from highway 37A

The eerie town of Stewart

The eerie town of Stewart


Interpreting Free Monads of Functor Sums

Find the source file for this post here. Really, that gist is formatted way nicer than this post, so I recommend you read it there. WordPress doesn’t seem to let me include non-inline CSS in my posts.

This text deals with a way to compose certain kinds of monads, thereby mixing their capabilities. It is a literate Haskell file, so let’s begin with a bunch of noise.

> {-# LANGUAGE MultiParamTypeClasses #-}
> {-# LANGUAGE FlexibleInstances #-}
> {-# LANGUAGE FlexibleContexts #-}
> {-# LANGUAGE RankNTypes #-}
> {-# LANGUAGE KindSignatures #-}
> {-# LANGUAGE GADTs #-}
> {-# LANGUAGE GeneralizedNewtypeDeriving #-}
> {-# LANGUAGE DeriveFunctor #-}
> {-# LANGUAGE TypeOperators #-}
> {-# LANGUAGE OverlappingInstances #-}
> 
> import Prelude hiding (log)
> import Control.Applicative
> import Control.Monad
> import Control.Monad.Free
> import Control.Monad.IO.Class
> import Control.Monad.Trans.Class
> import Control.Monad.Trans.Identity
> import Control.Monad.Trans.Reader

A motivating problem

Here are two functors and their free monads which define DSLs for addition of integers, and for logging.

> data PlusF a where
>    Plus :: Int -> Int -> (Int -> a) -> PlusF a
>
> instance Functor PlusF where
>     fmap f (Plus left right next) = Plus left right (fmap f next)
>
> data LogF s a where
>     Log :: s -> a -> LogF s a
>
> instance Functor (LogF s) where
>     fmap f (Log s next) = Log s (f next)
>
> type Plus = Free PlusF
> type Log s = Free (LogF s)
>
> plus :: Int -> Int -> Plus Int
> plus i j = liftF $ Plus i j id
>
> log :: s -> Log s ()
> log s = liftF $ Log s ()

Either of these free monads can be interpreted using iterM and an appropriate interpreter function.

> type Interpreter f m = forall a . f (m a) -> m a
> type TypeOfIterM = forall f m a . (Functor f, Monad m) => Interpreter f m -> Free f a -> m a

A Log String term, for example, can be interpreted by IO:

> stdoutLog' :: Interpreter (LogF String) IO
> stdoutLog' term = case term of
>     Log str next -> putStrLn str >> next
>
> runStdoutLog :: Log String t -> IO t
> runStdoutLog = iterM stdoutLog'

Similarly, we could interpet Plus using a reader which holds a modulus like so:

> modularPlus' :: Interpreter PlusF (Reader Int)
> modularPlus' term = case term of
>     Plus left right next -> do
>         modulus <- ask
>         next ((left + right) `mod` modulus)
>         
> runModularPlus :: Plus t -> Reader Int t
> runModularPlus = iterM modularPlus'

But suppose we want to mix Plus and Log String, so that we can add things and also log strings. We have interpreters for each of these; it would be nice to use them to define an interpreter for the composite. This text shows one way to achieve this.

A solution

To begin, we need the tools to define the monad composite of Plus and Log. The functor sum allows us to build new functors from old, and free monads on these sums are monads which contain the terms of each summand functor, which is just what we need.

> infixr 8 :+:
> data (f :+: g) a = SumL (f a) | SumR (g a)
>
> instance (Functor f, Functor g) => Functor (f :+: g) where
>     fmap f term = case term of
>         SumL x -> SumL (fmap f x)
>         SumR x -> SumR (fmap f x)

For any functor f, we can always inject it into a free monad over a functor sum in which it appears as a summand.

> injFL :: Functor f => Free f a -> Free (f :+: g) a
> injFR :: Functor f => Free f a -> Free (g :+: f) a

These are easy to spot:

> injFL term = case term of
>     Pure x -> Pure x
>     Free x -> Free (SumL (fmap injFL x))
>
> injFR term = case term of
>     Pure x -> Pure x
>     Free x -> Free (SumR (fmap injFR x))

Using typeclasses, we can extend this to sums of more than two summands, just like a construction from datatypes à la carte.

> class InjectSum f g where
>     inject :: f a -> g a
> 
> instance InjectSum f f where
>     inject = id
> 
> instance InjectSum f (f :+: h) where
>     inject = SumL
> 
> instance (InjectSum f h) => InjectSum f (g :+: h) where
>     inject = SumR . inject
>
> injectF_ :: (Functor g, InjectSum f g) => f a -> Free g a
> injectF_ = liftF . inject
>
> injectF :: (Functor f, Functor g, InjectSum f g) => Free f a -> Free g a
> injectF term = case term of
>     Pure a -> Pure a
>     Free fterm -> Free (inject (fmap injectF fterm))

Now we can tell GHC of our composite monad, LogPlus, and write terms in it.

> type LogPlusF = LogF String :+: PlusF
> type LogPlus = Free LogPlusF
>
> logPlusTerm :: Int -> LogPlus Int
> logPlusTerm i = do
>     injectF $ log ("Input is " ++ (show i))
>     sum <- injectF $ plus i 42
>     injectF $ log ("Sum computed successfully!")
>     return sum

But what use is this, when we have no way to interpret a LogPlus? We need something of type LogPlusF (m a) -> m a, and we need to somehow produce it from the existing interpreters for Log and Plus. If we could do this, then what would m be in the type above? It must in some sense contain the smaller interpreters, IO for Log and Reader Int for Plus. I can think of one such monad, namely:

> type CompositeInterpreter = IdentityT (ReaderT Int IO) 

And this begs us to think that maybe, if both functors in a sum have interpreters which are monad transformers, then the interpreter for their sum is these two transformers stacked atop one-another. The type :&: is used to denote this stacking, so that (IdentityT :&: ReaderT Int) IO = CompositeInterpreter.

> infixr 8 :&:
> newtype ((m :: (* -> *) -> * -> *) :&: (n :: (* -> *) -> * -> *)) (s :: * -> *) t = Trans {
>     runTrans :: m (n s) t
>   } deriving (Functor, Applicative, Monad, MonadIO)

If we rephrase the interpreters for Log and Plus as transformers, then we can write the type of the Interpreter which can handle their sum.

> newtype ModularPlusInterpreter m t = MPI {
>     runMPI :: ReaderT Int m t
>   } deriving (Functor, Applicative, Monad, MonadIO, MonadTrans)
>
> newtype StdoutLogInterpreter m t = SLI {
>    runSLI :: IdentityT m t
>  } deriving (Functor, Applicative, Monad, MonadIO, MonadTrans)
>
> modularPlus :: Monad m => Interpreter (PlusF) (ModularPlusInterpreter m)
> modularPlus term = case term of
>     Plus left right next -> do
>         modulus <- MPI ask
>         next ((left + right) `mod` modulus)
>
> stdoutLog :: MonadIO m => Interpreter (LogF String) (StdoutLogInterpreter m)
> stdoutLog term = case term of
>     Log str next -> do
>         SLI (liftIO (putStrLn str))
>         next
>
> stdoutLogAndModularPlus
>   :: Interpreter
>        (LogF String :+: PlusF)
>        ((StdoutLogInterpreter :&: ModularPlusInterpreter) IO)

A value for that last type isn’t obvious. It ought to be automatic, somehow inferred by GHC given the two smaller Interpreters. This calls for a typeclass which determines an Interpreter. This class FInterpreter m b f can be read as “m interprets f in base monad b”.

> class Functor f => FInterpreter (m :: (* -> *) -> * -> *) (b :: * -> *) (f :: * -> *) where
>     finterpret :: Interpreter f (m b)

Now we need to give an instance for :&:. We want m :&: n to interpret a functor sum f :+: g over some base b, but of course we have to qualify this with some assumptions:

  • n interprets g over b
  • m interprets f over n b
  • m, n give monads when stacked over b
  • m can be “stripped off” and then reset, as described by the class FTrans.

That last assumption probably means that not all monad transformers can be used as interpreters, but it’s nothing to worry about, because IdentityT and ReaderT r fulfill it, and these are just the monads we use for this demonstration.

> instance
>     ( FInterpreter n b g
>     , FInterpreter m (n b) f
>     , Monad b
>     , Monad (n b)
>     , Monad (m (n b))
>     , FTrans m
>     ) => FInterpreter (m :&: n) b (f :+: g)
>   where
>     finterpret term = case term of
>         SumL left -> do
>             let term' = fmap runTrans left
>             Trans (finterpret term')
>         SumR right -> do 
>             -- This is where FTrans is needed.
>             -- We can strip off m inside the functor g, interpret using
>             -- FInterpreter n b g, then return to m :&: n using inFTrans.
>             outR <- outFTrans
>             let term' = fmap outR right
>             inFTrans (finterpret term')
>
> class FTrans (m :: (* -> *) -> * -> *) where
>     outFTrans :: (Monad (m (n b)), Monad (n b)) => (m :&: n) b ((m :&: n) b t -> n b t)
>     inFTrans :: (Monad (m (n b))) => n b t -> (m :&: n) b t
>
> instance FTrans IdentityT where
>     outFTrans = Trans (IdentityT (return (runIdentityT . runTrans)))
>     inFTrans = Trans . IdentityT
>
> instance FTrans (ReaderT r) where
>     outFTrans = Trans (ReaderT (\r -> return (\x -> runReaderT (runTrans x) r)))
>     inFTrans = Trans . ReaderT . const

With FInterpreter and FTrans instances for our interpreters, we have enough machinery to give a value for stdoutLogAndModularPlus.

> instance Monad m => FInterpreter ModularPlusInterpreter m PlusF where
>     finterpret = modularPlus
>
> instance MonadIO m => FInterpreter StdoutLogInterpreter m (LogF String) where
>     finterpret = stdoutLog
>
> instance FTrans (ModularPlusInterpreter) where
>     outFTrans = Trans (MPI (ReaderT (\r -> return (\x -> runReaderT (runMPI (runTrans x)) r))))
>     inFTrans = Trans . MPI . ReaderT . const
>
> instance FTrans (StdoutLogInterpreter) where
>     outFTrans = Trans (SLI (IdentityT (return (runIdentityT . runSLI . runTrans))))
>     inFTrans = Trans . SLI . IdentityT
>
> stdoutLogAndModularPlus = finterpret

Actually, we don’t need FTrans ModularPlusInterpreter, because it sits below (on the right of) StdoutLogInterpreter in the composite interpreter, but it’s good to have anyway.

We can now interpret terms of Free (LogF :+: PlusF). This demands running every transformer in the stack, in-order, and stripping of the :&: constructors via runTrans, like so:

> step1 :: (:&:) StdoutLogInterpreter ModularPlusInterpreter IO Int
> step1 = iterM stdoutLogAndModularPlus (logPlusTerm 5)
> 
> step2 :: ModularPlusInterpreter IO Int
> step2 = runIdentityT . runSLI . runTrans $ step1
>
> step3 :: Int -> IO Int
> step3 i = (flip runReaderT) i . runMPI $ step2

And that’s it! Modular arithmetic with logging to stdout, arising from its parts.

I’m excited about this technique because it may open up a kind of normal form for programs, in which all business logic is expressed by DSLs defined by functors, and these DSLs are mixed via functor sums to give the program’s principal DSL, which is then interpreted by choosing interpreters for each part.


Parsing with more proofs

Earlier this month I showed a string parser implementation with machine-checked proof that the remaining unparsed string is always a tail of the input string. It’s good to know that parsers of this type are well behaved, and it will probably help you sleep at night, but defining a parser for a particular application reveals still unchecked properties.

Suppose for example that we have some datatype that we wish to serialize and deserialize. This comes up often enough in the real world, but instead of a real world, practical example, I’ll give a lesson in Canadian geography:


data Province
  = NewfoundlandAndLabrador
  | NovaScotia
  | PrinceEdwardIsland
  | NewBrunswick
  | Quebec
  | Ontario
  | Manitoba
  | Saskatchewan
  | Alberta
  | BritishColumbia

showProvince : Province -> List Char
showProvince NewfoundlandAndLabrador = unpack "Nfld"
showProvince NovaScotia = unpack "NS"
showProvince PrinceEdwardIsland = unpack "PEI"
showProvince NewBrunswick = unpack "NB"
showProvince Quebec = unpack "Que"
showProvince Ontario = unpack "Ont"
showProvince Manitoba = unpack "Man"
showProvince Saskatchewan = unpack "Sas"
showProvince Alberta = unpack "Alb"
showProvince BritishColumbia = unpack "BC"

Defined above is a type which represents all ten of Canada’s provinces, and a function which embeds them in the List Char type. The unpack function is used to convert from Idris’s string type, because my string parser actually expects a List Char, not a string.

With showProvince, we’re able to present a user with an abbreviated yet human-readable name for each province. To allow a user to present a province to the computer, we can take string input through a parser!


parseNewfoundlandAndLabrador : StringParser Province
parseNewfoundlandAndLabrador = string "Nfld" $> pure NewfoundlandAndLabrador

parseNovaScotia : StringParser Province
parseNovaScotia = string "NS" $> pure NovaScotia

parsePrinceEdwardIsland : StringParser Province
parsePrinceEdwardIsland = string "PEI" $> pure PrinceEdwardIsland

parseNewBrunswick : StringParser Province
parseNewBrunswick = string "NB" $> pure NewBrunswick

parseQuebec : StringParser Province
parseQuebec = string "Que" $> pure Quebec

parseOntario : StringParser Province
parseOntario = string "Ont" $> pure Ontario

parseManitoba : StringParser Province
parseManitoba = string "Man" $> pure Manitoba

parseSaskatchewan : StringParser Province
parseSaskatchewan = string "Sas" $> pure Saskatchewan

parseAlberta : StringParser Province
parseAlberta = string "Alb" $> pure Alberta

parseBritishColumbia : StringParser Province
parseBritishColumbia = string "BC" $> pure BritishColumbia

parseProvince : StringParser Province
parseProvince =
  parseNewfoundlandAndLabrador
  <|> parseNovaScotia
  <|> parsePrinceEdwardIsland
  <|> parseNewBrunswick
  <|> parseQuebec
  <|> parseOntario
  <|> parseManitoba
  <|> parseSaskatchewan
  <|> parseAlberta
  <|> parseBritishColumbia

Some definitions in the last snippet may look unfamiliar. They’re from the repository linked in the previous post, and Idris’s applicative functor module.

With parseProvince in place, we have in hand our serializer and deserializer combo:


serialize : Province -> List Char
serialize = showProvince

-- The parser is parseProvince <$ endOfInput because a deserializer
-- must consume the entire input string.
deserialize : List Char -> Maybe Province
deserialize cs = parsedValue (runStringParser (parseProvince <$ endOfInput) cs)

but this is not enough! Sure, we call these functions by very suggestive names, indicating that they lie in some useful relationship, but we haven’t actually checked that they in fact merit their names. If a serializer/deserializer pair is to be of any use, they must be in some sense compatible: we expect deserialize to be left-inverse to serialize, through the Maybe. More formally, we want:


forall p : Province . deserialize (serialize p) = Just p

In this example case of 10 Canadian province abbreviations, it’s not hard to convince yourself that this equality holds; in real world examples with more complicated datatypes, it may be more difficult. Fortunately for us, Idris can help! Here’s the proposition and proof:


leftInverse : (p : Province) -> (deserialize (serialize p) = Just p)
leftInverse NewfoundlandAndLabrador = Refl
leftInverse NovaScotia = Refl
leftInverse PrinceEdwardIsland = Refl
leftInverse NewBrunswick = Refl
leftInverse Quebec = Refl
leftInverse Ontario = Refl
leftInverse Manitoba = Refl
leftInverse Saskatchewan = Refl
leftInverse Alberta = Refl
leftInverse BritishColumbia = Refl

By writing Refl as the value for each case, we assert that the left- and right-hand sides of the = in the type will in fact be syntactically the same, given the particular Province matched on the left-hand side of each case. When we hand this program off to Idris, it checks that this is actually true. So, if Idris certifies this program well-typed, then it confirms just what we want to know: deserialize is left-inverse to serialize.

Finally, we can package up our combo in a datatype:


data Serializable : a -> Type where
  MkSerializable
    :  (serialize : a -> List Char)
    -> (deserialize : List Char -> Maybe a)
    -> (leftInverse : ((x : a) -> deserialize (serialize x) = Just x))
    -> Serializable a

so that whenever we have a Serializable a in hand, we know that we can serialize and deserialize values of type a in a well-behaved way. If we could not, then that value wouldn’t be well-typed! We can show at least one inhabitant of Serializable a:


provinceSerializable : Serializable Province
provinceSerializable = MkSerializable serialize deserialize leftInverse

Find the full source file here and run it for yourself! But start the process before you leave to have your lunch, because it takes about 20 minutes. However, if you change the serializer or deserializer so that they’re clearly not compatible, Idris very quickly discovers this and quits. I wonder what’s going on under the hood.

Of course, we could also check this left-inverse property via judiciously designed tests, but the formal method shown here has at least two advantages over testing:

  1. It is tightly integrated with the build: if the property doesn’t hold, the program doesn’t compile. With tests, it’s possible to obtain a build for which the property doesn’t hold.
  2. Tests can go stale, types cannot. If we change our datatype, we can get away with not updating our tests, but we can’t bypass the proof obligation built-in to the program here.

I’m excited to see methods like this applied in industry in the years to come!