Right now dynamic languages are popular in the scripting world, to the dismay of people who prefer statically typed languages for ease of maintenance.
Fortunately, Haskell is an excellent candidate for statically typed scripting for a few reasons:
- Haskell has lightweight syntax and very little boilerplate
- Haskell has global type inference, so all type annotations are optional
- You can type-check and interpret Haskell scripts very rapidly
- Haskell's function application syntax greatly resembles Bash
However, Haskell has had a poor "out-of-the-box" experience for a while, mainly due to:
- Poor default types in the Prelude (specifically String and FilePath)
- Useful scripting utilities being spread over a large number of libraries
- Insufficient polish or attention to user experience (in my subjective opinion)
To solve this, I'm releasing the turtle library, which provides a slick and comprehensive interface for writing shell-like scripts in Haskell. I've also written a beginner-friendly tutorial targeted at people who don't know any Haskell.
Overview
turtle is a reimplementation of the Unix command line environment in Haskell. The best way to explain this is to show what a simple "turtle script" looks like:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
main = do
    cd "/tmp"
    mkdir "test"
    output "test/foo" "Hello, world!"  -- Write "Hello, world!" to "test/foo"
    stdout (input "test/foo")          -- Stream "test/foo" to stdout
    rm "test/foo"
    rmdir "test"
    sleep 1
    die "Urk!"
If you make the above file executable, you can then run the program directly as a script:
$ chmod u+x example.hs
$ ./example.hs
Hello, world!
example.hs: user error (Urk!)
The turtle library renames a lot of existing Haskell utilities to match their Unix counterparts and places them under one import. This lets you reuse your shell scripting knowledge to get up and going quickly.
Shell compatibility
You can easily invoke an external process or shell command using proc or shell:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
main = do
    mkdir "test"
    output "test/file.txt" "Hello!"
    proc "tar" ["czf", "test.tar.gz", "test"] empty
    -- or: shell "tar czf test.tar.gz test" empty
Even people unfamiliar with Haskell will probably understand what the above program does.
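Both proc and shell hand back the exit code of the child process, so a script can react when a command fails. The following is a minimal sketch that reuses the tar invocation from above; failureMessage is a hypothetical helper introduced only for illustration and is not part of turtle:

```haskell
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}

import Turtle

-- Hypothetical helper (not part of turtle): render a failure message
failureMessage :: Int -> Text
failureMessage code = format ("tar failed with exit code "%d) code

main :: IO ()
main = do
    -- A non-zero exit status is reported as an ExitCode value
    code <- proc "tar" ["czf", "test.tar.gz", "test"] empty
    case code of
        ExitSuccess   -> echo "Archive created"
        ExitFailure n -> die (failureMessage n)
```

Keeping the message in a pure function makes it easy to check in ghci without actually running tar.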
Portability
"turtle scripts" run on Windows, OS X and Linux. You can either compile scripts as native executables or interpret the scripts if you have the Haskell compiler installed.
Streaming
You can build or consume streaming sources. For example, here's how you print all descendants of the /usr/lib directory in constant memory:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
main = view (lstree "/usr/lib")
... and here's how you count the number of descendants:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Control.Foldl as Fold
import Turtle
main = do
    n <- fold (lstree "/usr/lib") Fold.length
    print n
... and here's how you count the number of lines in all descendant files:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Control.Foldl as Fold
import Turtle
descendantLines = do
    file <- lstree "/usr/lib"
    True <- liftIO (testfile file)
    input file

main = do
    n <- fold descendantLines Fold.length
    print n
Exception Safety
turtle ensures that all acquired resources are safely released in the face of exceptions. For example, if you acquire a temporary directory or file, turtle will ensure that it's safely deleted afterwards:
example = do
    dir <- using (mktempdir "/tmp" "test")
    liftIO (die "The temporary directory will still be deleted!")
However, exception safety comes at a price. turtle forces you to consume all streams in their entirety so you can't lazily consume just the initial portion of a stream. This was a tradeoff I chose to keep the API as simple as possible.
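The release guarantee described here is the same kind of guarantee that bracket-style functions from Control.Exception give for a single resource. The sketch below uses only base, with a mutable flag standing in for a temporary directory. It illustrates the general acquire/release idea under that assumption and is not a description of how turtle is implemented internally; withFlag and leakedAfterFailure are hypothetical names:

```haskell
import Control.Exception (ErrorCall (..), bracket_, throwIO, try)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical stand-in for a resource: the flag is True while "acquired"
withFlag :: IORef Bool -> IO a -> IO a
withFlag flag = bracket_ (writeIORef flag True) (writeIORef flag False)

-- Run a failing action inside the resource, then report whether the
-- resource is still held afterwards
leakedAfterFailure :: IO Bool
leakedAfterFailure = do
    flag <- newIORef False
    _ <- try (withFlag flag (throwIO (ErrorCall "Urk!"))) :: IO (Either ErrorCall ())
    readIORef flag

main :: IO ()
main = do
    leaked <- leakedAfterFailure
    print leaked  -- False: the release action ran despite the exception
```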
Patterns
turtle supports Patterns, which are like improved regular expressions. Use Patterns as lightweight parsers to extract typed values from unstructured text:
$ ghci
>>> :set -XOverloadedStrings
>>> import Turtle
>>> data Pet = Cat | Dog deriving (Show)
>>> let pet = ("cat" *> return Cat) <|> ("dog" *> return Dog) :: Pattern Pet
>>> match pet "dog"
[Dog]
>>> match (pet `sepBy` ",") "cat,dog,cat"
[[Cat,Dog,Cat]]
You can also use Patterns as arguments to commands like sed, grep, and find, and they do the right thing:
>>> stdout (grep (prefix "c") "cat") -- grep '^c'
cat
>>> stdout (grep (has ("c" <|> "d")) "dog") -- grep 'c\|d'
dog
>>> stdout (sed (digit *> return "!") "ABC123") -- sed 's/[[:digit:]]/!/g'
ABC!!!
Unlike many Haskell parsers, Patterns are fully backtracking, no exceptions.
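To see what backtracking buys you, consider a pattern whose first part could greedily swallow the whole input. This is a small sketch using star and anyChar from turtle; the name endsWithDog and the input strings are made up for illustration:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Turtle

-- Zero or more arbitrary characters followed by the literal "dog".
-- The result is the text matched by the final "dog" sub-pattern.
endsWithDog :: Pattern Text
endsWithDog = star anyChar *> "dog"

main :: IO ()
main = do
    -- `star anyChar` could consume the entire input, but the matcher
    -- backtracks until the trailing "dog" also succeeds
    print (match endsWithDog "hot dog")  -- ["dog"]
    print (match endsWithDog "hot cat")  -- []
```

Note that match only reports parses that consume the entire input, which is why the second call finds nothing.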
Formatting
turtle supports typed printf-style string formatting:
>>> format ("I take "%d%" "%s%" arguments") 2 "typed"
"I take 2 typed arguments"
turtle even infers the number and types of arguments from the format string:
>>> :type format ("I take "%d%" "%s%" arguments")
format ("I take "%d%" "%s%" arguments") :: Int -> Text -> Text
This uses a simplified version of the Format type from the formatting library. Credit to Chris Done for the great idea.
The reason I didn't reuse the formatting library was that I spent a lot of effort keeping the types as simple as possible to improve error messages and inferred types.
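The inference works because each format specifier carries the argument it expects in its own type. The following is a simplified sketch of that idea using only base and String rather than Text. The definitions are reconstructed for illustration, use a hypothetical lit helper in place of overloaded string literals, and may differ from turtle's actual source:

```haskell
-- A specifier receives a continuation that consumes the rendered
-- String and produces a function `b` that takes the arguments
newtype Format a b = Format { runFormat :: (String -> a) -> b }

-- Concatenate two specifiers, threading the argument types through
(%) :: Format b c -> Format a b -> Format a c
f % g = Format (\k -> runFormat f (\x -> runFormat g (\y -> k (x ++ y))))

-- Hypothetical helper: a literal chunk that expects no arguments
lit :: String -> Format a a
lit str = Format (\k -> k str)

-- Expects one Int argument
d :: Format a (Int -> a)
d = Format (\k n -> k (show n))

-- Expects one String argument
s :: Format a (String -> a)
s = Format (\k str -> k str)

-- Finish with the identity continuation to get the final function
format :: Format String r -> r
format fmt = runFormat fmt id

main :: IO ()
main = putStrLn (format (lit "I take " % d % lit " " % s % lit " arguments") 2 "typed")
```

Each use of d or s adds one more function argument to the result type, so in this sketch the compiler infers Int -> String -> String for the composed format string.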
Learn more
turtle doesn't try to ambitiously reinvent shell scripting. Instead, turtle just strives to be a "better Bash". Embedding shell scripts in Haskell gives you the benefits of easy refactoring and basic sanity checking for your scripts.
You can find the turtle library on Hackage or GitHub. Also, turtle provides an extensive beginner-friendly tutorial targeted at people who don't know any Haskell at all.