Monday, July 4, 2016

Auto-generate service API endpoints from records

Haskell has pretty cool support for code generation from data type definitions using GHC generics. So I thought: "why not generate a service from a data type?".

The basic idea is pretty simple. Given a data type definition like this:

data Command
    = Create { filepath :: FilePath, contents :: String }
    | Delete { filepath :: FilePath }

... we'll auto-generate two API endpoints:

  • /create?filepath=:string&contents=:string
  • /delete?filepath=:string

Each endpoint accepts query parameters matching the fields for their respective constructors:

$ curl 'localhost:8080/create?filepath=test.txt&contents=ABC'
"File created"
$ cat test.txt
ABC
$ curl 'localhost:8080/delete?filepath=test.txt'
"File deleted"
$ cat test.txt
cat: test.txt: No such file or directory

The complete code to build the server looks like this:

{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric  #-}

import Server.Generic
import System.Directory (removeFile)

data Command
    = Create { filepath :: FilePath, contents :: String }
    | Delete { filepath :: FilePath }
    deriving (Generic, ParseRecord)

handler :: Command -> IO String
handler (Create file text) = do
    writeFile file text
    return "File created"
handler (Delete file) = do
    removeFile file
    return "File deleted"

main :: IO ()
main = serveJSON 8080 handler

You can test it yourself by running:

$ stack build server-generic
$ stack runghc AboveExample.hs

... and then in a separate terminal you can hit each endpoint with curl as illustrated above.

GHC Generics

The Haskell magic is in this one line of code:

    deriving (Generic, ParseRecord)

This auto-generates code that tells the server how to marshal the route and query parameters into our Command data type. All we have to do is supply a handler that pattern matches on the incoming Command to decide what to do:

handler :: Command -> IO String
handler (Create file text) = do
    writeFile file text
    return "File created"
handler (Delete file) = do
    removeFile file
    return "File deleted"

You can read that as saying:

  • "If a client hits the /create endpoint, create the specified file"
  • "If a client hits the /delete endpoint, delete the specified file"

As an exercise, you can try modifying the handler to respond with the name of the file that was created or deleted.

However, you're not limited to query parameters. You can also parse data from path tokens by just omitting field labels from the data type:

{-# LANGUAGE DeriveGeneric #-}

import Server.Generic

data Command
    = Add      Double Double
    | Multiply Double Double
    deriving (Generic)

instance ParseRecord Command

handler :: Command -> IO Double
handler (Add      x y) = return (x + y)
handler (Multiply x y) = return (x * y)

main :: IO ()
main = serveJSON 8080 handler

If you run the above code, you get a server that has two endpoints:

  • /add/:double/:double
  • /multiply/:double/:double

You can run the server and test that they work:

$ curl 'localhost:8080/add/2/3'
5
$ curl 'localhost:8080/multiply/2/3'
6

This library also intelligently handles optional and repeated fields in data type definitions. For example, suppose we serve this data type::

{-# LANGUAGE DeriveGeneric  #-}
{-# LANGUAGE DeriveAnyClass #-}

import Server.Generic

data Example = Example
    { list     :: [Int]
    , optional :: Maybe   Int
    , first    :: First   Int
    , last     :: Last    Int
    } deriving (Generic, ParseRecord, ToJSON)

handler :: Example -> IO Example
handler = return  -- Serve decoded value back to client as JSON

main :: IO ()
main = serveJSON 8080 handler

... then the server will echo back the decoded type as JSON, correctly handling absent or repeated fields:

$ curl 'localhost:8080/example'
{"list":[],"first":null,"last":null,"optional":null}
$ curl 'localhost:8080/example?optional=1&list=1&list=2&first=1&first=2&last=1&last=2'
{"list":[1,2],"first":1,"last":2,"optional":1}

Also, these repeated and optional annotations work for path components, too, in case you were wondering:

{-# LANGUAGE DeriveGeneric  #-}
{-# LANGUAGE DeriveAnyClass #-}

import Server.Generic

data Example = Example [Int] (Maybe Text)
    deriving (Generic, ParseRecord, ToJSON)

handler :: Example -> IO Example
handler = return

main :: IO ()
main = serveJSON 8080 handler

The above server does "the right thing" and doesn't need to be told where the Ints end and the Text begins:

$ curl 'localhost:8080/example'
[[],null]
$ curl 'localhost:8080/example/1/2/foo'
[[1,2],"foo"]
$ curl 'localhost:8080/example/1/2/3'
[[1,2,3],null]
$ curl 'localhost:8080/example/foo'
[[],"foo"]

The server uses backtracking when parsing the route so the server knows when the Ints end and the Text begins.

Types

The whole thing is strongly typed, which means several things in the context of service programming.

For example, if you define a data type that expects an Int field, then by golly your handler will get an Int field or the server will automatically reject the request for you. You don't have to worry about checking that the field is present nor do you need to validate that the parameter decodes to an Int correctly. If you want the parameter to be optional then you need to make that explicit by marking the field as type Maybe Int.

You also don't have to handle fields that belong to other endpoints. Each endpoint only gets exactly the fields it requested; no more, no less. If a given endpoint gets the wrong set of path tokens or query parameters then the server rejects the request for you.

This is also strongly typed in the sense that more logic is pushed into the type and less logic goes in the handler. If you want just the first or last occurrence of a query parameter, you just annotate the type with First or Last, respectively. The more logic you push into declarative type-level programming the more you distill your handler to focus on business logic.

Caveats

I wrote this library to provide a quick an easy way to spin up Haskell web services but the library could still use some improvement. I'm not really a web developer so I only kind of know what I'm doing and could use help from people more knowledgeable than me.

The most notable deficiency is that the library does not take care to serve proper HTTP status codes for different types of errors. Every failed request returns a 404 status code.

Also, if the route is malformed the error message is a completely unhelpful "404 Not Found" error message that doesn't indicate how to fix the error.

Another blatant deficiency is that the server completely ignores the request method. I wasn't sure how to design this to work within the framework of data type generic programming.

If you have ideas about how to improve things I would greatly welcome any contributions.

Conclusions

People familiar with Haskell will recognize that this library resembles the servant library in some respects. The high-level difference is that this is a subset of servant that is much simpler but also significantly less featureful. For example, servant can also generate client-side bindings and Swagger resource declarations and servant also permits a much greater degree of customization.

This library focuses primarily on simple quick-and-dirty services; for anything more polished and production-ready you will probably want to try other Haskell service libraries. I just wanted to make it as easy as possible for people to get started with back-end development in Haskell and also show off how cool and powerful GHC generics can be.

If you would like to learn more about this library you can read the tutorial or if you would like to use the library you can obtain the code from Hackage or Github.

Sunday, July 3, 2016

list-transformer - A beginner-friendly ListT

Currently, Hackage has four implementations of "ListT-done-right" that I'm aware of:

  • LogicT
  • pipes (which provides a ListT type)
  • list-t
  • List

However, I felt that all of these libraries were more complex than they needed to be so I tried to distill them down to the simplest library possible. I want to encourage more people to use ListT so I'm releasing the beginner-friendly list-transformer library .

There are a few design choices I made to improve the new user experience for this library:

First, the ListT data type is not abstract and users are encouraged to use constructors to both assemble and pattern match on ListT values. This helps them build an intuition for how ListT works under the hood since the type is small and not difficult to use directly:

newtype ListT m a = ListT { next :: m (Step m a) }

data Step m a = Cons a (ListT m a) | Nil

Second, the API is tiny in order to steer users towards leaning on Haskell's "collections API" as much as possible. Specifically, I try to direct people towards these type classes:

  • Functor/Applicative/Monad
  • Alternative/MonadPlus
  • MonadTrans/MonadIO

Right now there are only three functions in the API that are not class methods:

  • runListT
  • fold
  • foldM

Everything else is a method for one of the standard type classes and I avoid introducing new type classes.

Third, the API does not provide convenient helper functions for fully materializing the list. In other words, there is no utility function of this type:

toList :: ListT m a -> m [a]

A determined user can still force the list either indirectly via the fold function or by manually recursing over the ListT type. The goal is not to forbid this behavior, but rather to gently encourage people to preserve streaming. The API promotes the intuition that you're supposed to transform and consume results one-by-one instead of in large batches.

Fourth, the library comes with a long inline tutorial which is longer than the actual code. I think the tutorial could still use some improvement so if you would like to contribute to improve the tutorial please do!

Conclusion

You can find the list-transformer library either on Hackage or on Github.

Hopefully this encourages more people to give ListT a try and provides a stepping stone for understanding more complex streaming abstractoins.