Promoting the arrow type
Posted: 28/01/2016 Filed under: Computer science, Dependent Types, GHC, Haskell, Programming Leave a commentDependent Haskell is becoming reality. With GHC 8.0 we have GADT constructor promotion, which allows for new forms of type constructors, but doing functional programming on types remains awkward because the type-level analgoues of term-level functions–the type families–are not types at all. The first-class functions which make functional programming such a joy simply are not present at the type-level in GHC’s Haskell.
The 8.0 release permits a simple construction for making type families appear to be first-class. The idea is to proxy them with actual data types. For every type-level function, there is a type constructor, and a type family instance defined on that type constructor which actually implements the function. The simplest example is the identity function:
type Id = F IdProxy data IdProxy (x :: k) (p :: Proxy k) type instance EvalFunction (IdProxy x) = x
Type family proxies like IdProxy always follow a certain form: their final argument must be a Proxy which indicates the kind of thing it returns when evaluated with EvalFunction. This is plain in the definition of that family:
type family EvalFunction (d :: Proxy l -> Type) :: l
The type family F, used to define Id, is a kind of type-level smart constructor, which uses a type family proxy to come up with a type in the kind Function, which contains not only these promoted type families, but also the type constructors and promoted data constructors which already behave, more or less, like first-class functions:
data Function (s :: Type) (t :: Type) where Constructor :: (k -> l) -> Function s t Family :: (k -> l) -> Function s t infixr 0 :-> type s :-> t = Function s t
There’s another smart constructor C for type and data constructors. It’s no mystery why we need these: in either of the Function constructors there is no relationship between the arrow kinded type k -> l and the parameters s and t.
As function construction is handled by type families F and C, function deconstruction (application) is performed by a type family called At.
Id `At` 'True ~ 'True
With these ingredients, we can do functional programming with types:
type Plus = F PlusProxy data PlusProxy (n :: Nat) (m :: Nat) (p :: Proxy Nat) type instance EvalFunction (PlusProxy n m) = n + m type Foldr = F FoldrProxy data FoldrProxy (f :: a :-> b :-> b) (r :: b) (xs :: [a]) (p :: Proxy b) type instance EvalFunction (FoldrProxy f r '[]) = r type instance EvalFunction (FoldrProxy f r (x ': xs)) = f `At` x `At` (Foldr `At` f `At` r `At` xs) -- Six ~ 6 :: Nat type Six = Foldr `At` Plus `At` 0 `At` '[1,2,3]
We can even do ad-hoc polymorphism and give analogues of fmap, pure, <*>, and >>=. Check out the repository for more.
Diplomacy
Posted: 20/08/2015 Filed under: Computer science, Haskell Leave a commentThis is an announcement of the diplomacy and diplomacy-server packages. The former is a library with definitions relevant to the famous board game called Diplomacy, and the latter is an implementation of the game atop that library.
There are plenty of Diplomacy implementations out there. I’ve created this one for at least two reasons:
- Diplomacy is a board game with a rule book. I wondered, can’t the rule book simply be transcribed into a programming language and voilà, an implementation? After all, the rule book is supposedly a specification of the game, and so is any correct implementation of the game.
- This implementation allows for what I think is a pretty cool style of play: get together with your friends, start the server and create a game, and have each player control their forces from their own private devices (smartphones or laptops).
For the first point, I was dead wrong. Even in an extremely high-level language like Haskell, it’s very hard to transcribe the Diplomacy rule book into a program. I still wonder exactly why this is the case. The fact that Diplomacy in particular has a few ambiguities in the rules does not help, but the DATC deals with these, so that’s not an essential problem. My best explanation for the difficulty–the same difficulty we have with transcribing a product manager’s spec into a working product–is that there’s some gap in the specification which our brains fill in, but which leaves our programs incomplete. It’s straightforward to transcribe all of the rules of a board game into well-typed Haskell, but apparently that’s not enough: we need algorithms to work with these rules which handle all cases, whereas it seems human interpretation can settle for a few examples and infer an algorithm for game play.
Interpreting Free Monads of Functor Sums
Posted: 21/05/2015 Filed under: Computer science, Haskell, Programming Leave a commentFind the source file for this post here. Really, that gist is formatted way nicer than this post, so I recommend you read it there. WordPress doesn’t seem to let me include non-inline CSS in my posts.
This text deals with a way to compose certain kinds of monads, thereby mixing their capabilities. It is a literate Haskell file, so let’s begin with a bunch of noise.
> {-# LANGUAGE MultiParamTypeClasses #-} > {-# LANGUAGE FlexibleInstances #-} > {-# LANGUAGE FlexibleContexts #-} > {-# LANGUAGE RankNTypes #-} > {-# LANGUAGE KindSignatures #-} > {-# LANGUAGE GADTs #-} > {-# LANGUAGE GeneralizedNewtypeDeriving #-} > {-# LANGUAGE DeriveFunctor #-} > {-# LANGUAGE TypeOperators #-} > {-# LANGUAGE OverlappingInstances #-} > > import Prelude hiding (log) > import Control.Applicative > import Control.Monad > import Control.Monad.Free > import Control.Monad.IO.Class > import Control.Monad.Trans.Class > import Control.Monad.Trans.Identity > import Control.Monad.Trans.Reader
A motivating problem
Here are two functors and their free monads which define DSLs for addition of integers, and for logging.
> data PlusF a where > Plus :: Int -> Int -> (Int -> a) -> PlusF a > > instance Functor PlusF where > fmap f (Plus left right next) = Plus left right (fmap f next) > > data LogF s a where > Log :: s -> a -> LogF s a > > instance Functor (LogF s) where > fmap f (Log s next) = Log s (f next) > > type Plus = Free PlusF > type Log s = Free (LogF s) > > plus :: Int -> Int -> Plus Int > plus i j = liftF $ Plus i j id > > log :: s -> Log s () > log s = liftF $ Log s ()
Either of these free monads can be interpreted using iterM
and an appropriate interpreter function.
> type Interpreter f m = forall a . f (m a) -> m a > type TypeOfIterM = forall f m a . (Functor f, Monad m) => Interpreter f m -> Free f a -> m a
A Log String
term, for example, can be interpreted by IO
:
> stdoutLog' :: Interpreter (LogF String) IO > stdoutLog' term = case term of > Log str next -> putStrLn str >> next > > runStdoutLog :: Log String t -> IO t > runStdoutLog = iterM stdoutLog'
Similarly, we could interpet Plus
using a reader which holds a modulus like so:
> modularPlus' :: Interpreter PlusF (Reader Int) > modularPlus' term = case term of > Plus left right next -> do > modulus <- ask > next ((left + right) `mod` modulus) > > runModularPlus :: Plus t -> Reader Int t > runModularPlus = iterM modularPlus'
But suppose we want to mix Plus
and Log String
, so that we can add things and also log strings. We have interpreters for each of these; it would be nice to use them to define an interpreter for the composite. This text shows one way to achieve this.
A solution
To begin, we need the tools to define the monad composite of Plus
and Log
. The functor sum allows us to build new functors from old, and free monads on these sums are monads which contain the terms of each summand functor, which is just what we need.
> infixr 8 :+: > data (f :+: g) a = SumL (f a) | SumR (g a) > > instance (Functor f, Functor g) => Functor (f :+: g) where > fmap f term = case term of > SumL x -> SumL (fmap f x) > SumR x -> SumR (fmap f x)
For any functor f
, we can always inject it into a free monad over a functor sum in which it appears as a summand.
> injFL :: Functor f => Free f a -> Free (f :+: g) a > injFR :: Functor f => Free f a -> Free (g :+: f) a
These are easy to spot:
> injFL term = case term of > Pure x -> Pure x > Free x -> Free (SumL (fmap injFL x)) > > injFR term = case term of > Pure x -> Pure x > Free x -> Free (SumR (fmap injFR x))
Using typeclasses, we can extend this to sums of more than two summands, just like a construction from datatypes à la carte.
> class InjectSum f g where > inject :: f a -> g a > > instance InjectSum f f where > inject = id > > instance InjectSum f (f :+: h) where > inject = SumL > > instance (InjectSum f h) => InjectSum f (g :+: h) where > inject = SumR . inject > > injectF_ :: (Functor g, InjectSum f g) => f a -> Free g a > injectF_ = liftF . inject > > injectF :: (Functor f, Functor g, InjectSum f g) => Free f a -> Free g a > injectF term = case term of > Pure a -> Pure a > Free fterm -> Free (inject (fmap injectF fterm))
Now we can tell GHC of our composite monad, LogPlus
, and write terms in it.
> type LogPlusF = LogF String :+: PlusF > type LogPlus = Free LogPlusF > > logPlusTerm :: Int -> LogPlus Int > logPlusTerm i = do > injectF $ log ("Input is " ++ (show i)) > sum <- injectF $ plus i 42 > injectF $ log ("Sum computed successfully!") > return sum
But what use is this, when we have no way to interpret a LogPlus
? We need something of type LogPlusF (m a) -> m a
, and we need to somehow produce it from the existing interpreters for Log
and Plus
. If we could do this, then what would m
be in the type above? It must in some sense contain the smaller interpreters, IO
for Log
and Reader Int
for Plus
. I can think of one such monad, namely:
> type CompositeInterpreter = IdentityT (ReaderT Int IO)
And this begs us to think that maybe, if both functors in a sum have interpreters which are monad transformers, then the interpreter for their sum is these two transformers stacked atop one-another. The type :&:
is used to denote this stacking, so that (IdentityT :&: ReaderT Int) IO = CompositeInterpreter
.
> infixr 8 :&: > newtype ((m :: (* -> *) -> * -> *) :&: (n :: (* -> *) -> * -> *)) (s :: * -> *) t = Trans { > runTrans :: m (n s) t > } deriving (Functor, Applicative, Monad, MonadIO)
If we rephrase the interpreters for Log
and Plus
as transformers, then we can write the type of the Interpreter
which can handle their sum.
> newtype ModularPlusInterpreter m t = MPI { > runMPI :: ReaderT Int m t > } deriving (Functor, Applicative, Monad, MonadIO, MonadTrans) > > newtype StdoutLogInterpreter m t = SLI { > runSLI :: IdentityT m t > } deriving (Functor, Applicative, Monad, MonadIO, MonadTrans) > > modularPlus :: Monad m => Interpreter (PlusF) (ModularPlusInterpreter m) > modularPlus term = case term of > Plus left right next -> do > modulus <- MPI ask > next ((left + right) `mod` modulus) > > stdoutLog :: MonadIO m => Interpreter (LogF String) (StdoutLogInterpreter m) > stdoutLog term = case term of > Log str next -> do > SLI (liftIO (putStrLn str)) > next > > stdoutLogAndModularPlus > :: Interpreter > (LogF String :+: PlusF) > ((StdoutLogInterpreter :&: ModularPlusInterpreter) IO)
A value for that last type isn’t obvious. It ought to be automatic, somehow inferred by GHC given the two smaller Interpreter
s. This calls for a typeclass which determines an Interpreter
. This class FInterpreter m b f
can be read as “m interprets f in base monad b”.
> class Functor f => FInterpreter (m :: (* -> *) -> * -> *) (b :: * -> *) (f :: * -> *) where > finterpret :: Interpreter f (m b)
Now we need to give an instance for :&:
. We want m :&: n
to interpret a functor sum f :+: g
over some base b
, but of course we have to qualify this with some assumptions:
n
interpretsg
overb
m
interpretsf
overn b
m
,n
give monads when stacked overb
m
can be “stripped off” and then reset, as described by the classFTrans
.
That last assumption probably means that not all monad transformers can be used as interpreters, but it’s nothing to worry about, because IdentityT
and ReaderT r
fulfill it, and these are just the monads we use for this demonstration.
> instance > ( FInterpreter n b g > , FInterpreter m (n b) f > , Monad b > , Monad (n b) > , Monad (m (n b)) > , FTrans m > ) => FInterpreter (m :&: n) b (f :+: g) > where > finterpret term = case term of > SumL left -> do > let term' = fmap runTrans left > Trans (finterpret term') > SumR right -> do > -- This is where FTrans is needed. > -- We can strip off m inside the functor g, interpret using > -- FInterpreter n b g, then return to m :&: n using inFTrans. > outR <- outFTrans > let term' = fmap outR right > inFTrans (finterpret term') > > class FTrans (m :: (* -> *) -> * -> *) where > outFTrans :: (Monad (m (n b)), Monad (n b)) => (m :&: n) b ((m :&: n) b t -> n b t) > inFTrans :: (Monad (m (n b))) => n b t -> (m :&: n) b t > > instance FTrans IdentityT where > outFTrans = Trans (IdentityT (return (runIdentityT . runTrans))) > inFTrans = Trans . IdentityT > > instance FTrans (ReaderT r) where > outFTrans = Trans (ReaderT (\r -> return (\x -> runReaderT (runTrans x) r))) > inFTrans = Trans . ReaderT . const
With FInterpreter
and FTrans
instances for our interpreters, we have enough machinery to give a value for stdoutLogAndModularPlus
.
> instance Monad m => FInterpreter ModularPlusInterpreter m PlusF where > finterpret = modularPlus > > instance MonadIO m => FInterpreter StdoutLogInterpreter m (LogF String) where > finterpret = stdoutLog > > instance FTrans (ModularPlusInterpreter) where > outFTrans = Trans (MPI (ReaderT (\r -> return (\x -> runReaderT (runMPI (runTrans x)) r)))) > inFTrans = Trans . MPI . ReaderT . const > > instance FTrans (StdoutLogInterpreter) where > outFTrans = Trans (SLI (IdentityT (return (runIdentityT . runSLI . runTrans)))) > inFTrans = Trans . SLI . IdentityT > > stdoutLogAndModularPlus = finterpret
Actually, we don’t need FTrans ModularPlusInterpreter
, because it sits below (on the right of) StdoutLogInterpreter
in the composite interpreter, but it’s good to have anyway.
We can now interpret terms of Free (LogF :+: PlusF)
. This demands running every transformer in the stack, in-order, and stripping of the :&:
constructors via runTrans
, like so:
> step1 :: (:&:) StdoutLogInterpreter ModularPlusInterpreter IO Int > step1 = iterM stdoutLogAndModularPlus (logPlusTerm 5) > > step2 :: ModularPlusInterpreter IO Int > step2 = runIdentityT . runSLI . runTrans $ step1 > > step3 :: Int -> IO Int > step3 i = (flip runReaderT) i . runMPI $ step2
And that’s it! Modular arithmetic with logging to stdout, arising from its parts.
I’m excited about this technique because it may open up a kind of normal form for programs, in which all business logic is expressed by DSLs defined by functors, and these DSLs are mixed via functor sums to give the program’s principal DSL, which is then interpreted by choosing interpreters for each part.
Parsing with more proofs
Posted: 21/01/2015 Filed under: Canadian Geography, Dependent Types, Idris, Programming Leave a commentEarlier this month I showed a string parser implementation with machine-checked proof that the remaining unparsed string is always a tail of the input string. It’s good to know that parsers of this type are well behaved, and it will probably help you sleep at night, but defining a parser for a particular application reveals still unchecked properties.
Suppose for example that we have some datatype that we wish to serialize and deserialize. This comes up often enough in the real world, but instead of a real world, practical example, I’ll give a lesson in Canadian geography:
data Province
= NewfoundlandAndLabrador
| NovaScotia
| PrinceEdwardIsland
| NewBrunswick
| Quebec
| Ontario
| Manitoba
| Saskatchewan
| Alberta
| BritishColumbia
showProvince : Province -> List Char
showProvince NewfoundlandAndLabrador = unpack "Nfld"
showProvince NovaScotia = unpack "NS"
showProvince PrinceEdwardIsland = unpack "PEI"
showProvince NewBrunswick = unpack "NB"
showProvince Quebec = unpack "Que"
showProvince Ontario = unpack "Ont"
showProvince Manitoba = unpack "Man"
showProvince Saskatchewan = unpack "Sas"
showProvince Alberta = unpack "Alb"
showProvince BritishColumbia = unpack "BC"
Defined above is a type which represents all ten of Canada’s provinces, and a function which embeds them in the List Char
type. The unpack
function is used to convert from Idris’s string type, because my string parser actually expects a List Char
, not a string.
With showProvince
, we’re able to present a user with an abbreviated yet human-readable name for each province. To allow a user to present a province to the computer, we can take string input through a parser!
parseNewfoundlandAndLabrador : StringParser Province
parseNewfoundlandAndLabrador = string "Nfld" $> pure NewfoundlandAndLabrador
parseNovaScotia : StringParser Province
parseNovaScotia = string "NS" $> pure NovaScotia
parsePrinceEdwardIsland : StringParser Province
parsePrinceEdwardIsland = string "PEI" $> pure PrinceEdwardIsland
parseNewBrunswick : StringParser Province
parseNewBrunswick = string "NB" $> pure NewBrunswick
parseQuebec : StringParser Province
parseQuebec = string "Que" $> pure Quebec
parseOntario : StringParser Province
parseOntario = string "Ont" $> pure Ontario
parseManitoba : StringParser Province
parseManitoba = string "Man" $> pure Manitoba
parseSaskatchewan : StringParser Province
parseSaskatchewan = string "Sas" $> pure Saskatchewan
parseAlberta : StringParser Province
parseAlberta = string "Alb" $> pure Alberta
parseBritishColumbia : StringParser Province
parseBritishColumbia = string "BC" $> pure BritishColumbia
parseProvince : StringParser Province
parseProvince =
parseNewfoundlandAndLabrador
<|> parseNovaScotia
<|> parsePrinceEdwardIsland
<|> parseNewBrunswick
<|> parseQuebec
<|> parseOntario
<|> parseManitoba
<|> parseSaskatchewan
<|> parseAlberta
<|> parseBritishColumbia
Some definitions in the last snippet may look unfamiliar. They’re from the repository linked in the previous post, and Idris’s applicative functor module.
With parseProvince
in place, we have in hand our serializer and deserializer combo:
serialize : Province -> List Char
serialize = showProvince
-- The parser is parseProvince <$ endOfInput because a deserializer
-- must consume the entire input string.
deserialize : List Char -> Maybe Province
deserialize cs = parsedValue (runStringParser (parseProvince <$ endOfInput) cs)
but this is not enough! Sure, we call these functions by very suggestive names, indicating that they lie in some useful relationship, but we haven’t actually checked that they in fact merit their names. If a serializer/deserializer pair is to be of any use, they must be in some sense compatible: we expect deserialize to be left-inverse to serialize, through the Maybe. More formally, we want:
forall p : Province . deserialize (serialize p) = Just p
In this example case of 10 Canadian province abbreviations, it’s not hard to convince yourself that this equality holds; in real world examples with more complicated datatypes, it may be more difficult. Fortunately for us, Idris can help! Here’s the proposition and proof:
leftInverse : (p : Province) -> (deserialize (serialize p) = Just p)
leftInverse NewfoundlandAndLabrador = Refl
leftInverse NovaScotia = Refl
leftInverse PrinceEdwardIsland = Refl
leftInverse NewBrunswick = Refl
leftInverse Quebec = Refl
leftInverse Ontario = Refl
leftInverse Manitoba = Refl
leftInverse Saskatchewan = Refl
leftInverse Alberta = Refl
leftInverse BritishColumbia = Refl
By writing Refl
as the value for each case, we assert that the left- and right-hand sides of the =
in the type will in fact be syntactically the same, given the particular Province
matched on the left-hand side of each case. When we hand this program off to Idris, it checks that this is actually true. So, if Idris certifies this program well-typed, then it confirms just what we want to know: deserialize is left-inverse to serialize.
Finally, we can package up our combo in a datatype:
data Serializable : a -> Type where
MkSerializable
: (serialize : a -> List Char)
-> (deserialize : List Char -> Maybe a)
-> (leftInverse : ((x : a) -> deserialize (serialize x) = Just x))
-> Serializable a
so that whenever we have a Serializable a
in hand, we know that we can serialize and deserialize values of type a
in a well-behaved way. If we could not, then that value wouldn’t be well-typed! We can show at least one inhabitant of Serializable a
:
provinceSerializable : Serializable Province
provinceSerializable = MkSerializable serialize deserialize leftInverse
Find the full source file here and run it for yourself! But start the process before you leave to have your lunch, because it takes about 20 minutes. However, if you change the serializer or deserializer so that they’re clearly not compatible, Idris very quickly discovers this and quits. I wonder what’s going on under the hood.
Of course, we could also check this left-inverse property via judiciously designed tests, but the formal method shown here has at least two advantages over testing:
- It is tightly integrated with the build: if the property doesn’t hold, the program doesn’t compile. With tests, it’s possible to obtain a build for which the property doesn’t hold.
- Tests can go stale, types cannot. If we change our datatype, we can get away with not updating our tests, but we can’t bypass the proof obligation built-in to the program here.
I’m excited to see methods like this applied in industry in the years to come!