Right now dynamic languages are popular in the scripting world, to the dismay of people who prefer statically typed languages for ease of maintenance.
Fortunately, Haskell is an excellent candidate for statically typed scripting for a few reasons:
- Haskell has lightweight syntax and very little boilerplate
- Haskell has global type inference, so all type annotations are optional
- You can type-check and interpret Haskell scripts very rapidly
- Haskell's function application syntax greatly resembles Bash
However, Haskell has had a poor "out-of-the-box" experience for a while, mainly due to:
- Poor default types in the Prelude (specifically String and FilePath)
- Useful scripting utilities being spread over a large number of libraries
- Insufficient polish or attention to user experience (in my subjective opinion)
To solve this, I'm releasing the turtle library, which provides a slick and comprehensive interface for writing shell-like scripts in Haskell. I've also written a beginner-friendly tutorial targeted at people who don't know any Haskell.
Overview
turtle is a reimplementation of the Unix command line environment in Haskell. The best way to explain this is to show what a simple "turtle script" looks like:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
main = do
    cd "/tmp"
    mkdir "test"
    output "test/foo" "Hello, world!"  -- Write "Hello, world!" to "test/foo"
    stdout (input "test/foo")          -- Stream "test/foo" to stdout
    rm "test/foo"
    rmdir "test"
    sleep 1
    die "Urk!"
If you make the above file executable, you can then run the program directly as a script:
$ chmod u+x example.hs
$ ./example.hs
Hello, world!
example.hs: user error (Urk!)
The turtle library renames a lot of existing Haskell utilities to match their Unix counterparts and places them under one import. This lets you reuse your shell scripting knowledge to get up and going quickly.
Shell compatibility
You can easily invoke an external process or shell command using proc or shell:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
main = do
    mkdir "test"
    output "test/file.txt" "Hello!"
    proc "tar" ["czf", "test.tar.gz", "test"] empty
    -- or: shell "tar czf test.tar.gz test" empty
Even people unfamiliar with Haskell will probably understand what the above program does.
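Both proc and shell return the exit code of the command they run, so a script can decide what to do when an external command fails. Here is a minimal sketch of that idea. The helper name run is hypothetical and not part of turtle; it only wraps shell and aborts on a non-zero exit code:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Turtle

-- | Run a shell command and abort the script if it fails.
-- (`run` is a hypothetical helper for illustration, not a turtle function.)
run :: Text -> IO ()
run cmd = do
    code <- shell cmd empty              -- `empty` means "no standard input"
    case code of
        ExitSuccess   -> return ()
        ExitFailure n -> die (cmd <> " failed with exit code: " <> repr n)

main :: IO ()
main = do
    run "mkdir test"
    run "tar czf test.tar.gz test"
```

Whether to abort or carry on after a failure is a design choice; this sketch stops at the first failing command, much like `set -e` in Bash.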
Portability
"turtle scripts" run on Windows, OS X and Linux. You can either compile scripts as native executables or interpret the scripts if you have the Haskell compiler installed.
Streaming
You can build or consume streaming sources. For example, here's how you print all descendants of the /usr/lib directory in constant memory:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
main = view (lstree "/usr/lib")
... and here's how you count the number of descendants:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Control.Foldl as Fold
import Turtle
main = do
    n <- fold (lstree "/usr/lib") Fold.length
    print n
... and here's how you count the number of lines in all descendant files:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Control.Foldl as Fold
import Turtle
descendantLines = do
    file <- lstree "/usr/lib"
    True <- liftIO (testfile file)
    input file

main = do
    n <- fold descendantLines Fold.length
    print n
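To give some intuition for why folds like these can run in constant memory, here is a toy model of a stream that hands its elements one at a time to a strict left fold. This is only a sketch using base: the names Stream, fromList, keep and count are hypothetical, and turtle's real stream type differs in its details.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Monad (foldM)

-- | A toy stream: it knows how to feed its elements, one at a time,
-- to any effectful left fold. (Hypothetical type, not turtle's own.)
newtype Stream a = Stream
    { foldStream :: forall r. (r -> a -> IO r) -> r -> IO r }

-- | Stream the elements of a list.
fromList :: [a] -> Stream a
fromList xs = Stream (\step begin -> foldM step begin xs)

-- | Keep only the elements that satisfy a predicate.
keep :: (a -> Bool) -> Stream a -> Stream a
keep p (Stream k) =
    Stream (\step begin -> k (\acc x -> if p x then step acc x else return acc) begin)

-- | Count the elements, forcing the counter at every step so that
-- no chain of unevaluated additions builds up.
count :: Stream a -> IO Int
count stream = foldStream stream (\n _ -> return $! n + 1) 0

main :: IO ()
main = do
    n <- count (keep even (fromList [1 .. 1000000 :: Int]))
    print n
```

Because the consumer only ever holds the running counter, the elements never need to be collected into an intermediate list.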
Exception Safety
turtle ensures that all acquired resources are safely released in the face of exceptions. For example, if you acquire a temporary directory or file, turtle will ensure that it's safely deleted afterwards:
example = do
    dir <- using (mktempdir "/tmp" "test")
    liftIO (die "The temporary directory will still be deleted!")
However, exception safety comes at a price. turtle forces you to consume all streams in their entirety so you can't lazily consume just the initial portion of a stream. This was a tradeoff I chose to keep the API as simple as possible.
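The guarantee itself can be illustrated with bracket from base, which pairs an acquire action with a release action that runs even if the body throws. This is an analogy under stated assumptions rather than turtle's implementation: the "resource" here is just an IORef flag, and releasedAfterFailure is a hypothetical name.

```haskell
import Control.Exception (ErrorCall (..), SomeException, bracket, throwIO, try)
import Data.IORef (newIORef, readIORef, writeIORef)

-- | Acquire a pretend resource, fail while using it, and report whether
-- the release action still ran. (Hypothetical helper for illustration.)
releasedAfterFailure :: IO Bool
releasedAfterFailure = do
    released <- newIORef False
    let acquire   = return ()                   -- stand-in for creating a temporary directory
        release _ = writeIORef released True    -- stand-in for deleting it
        use _     = throwIO (ErrorCall "Urk!")  -- the body fails part-way through
    result <- try (bracket acquire release use) :: IO (Either SomeException ())
    case result of
        Left _   -> readIORef released          -- the exception propagated; was release run?
        Right () -> return False                -- not reached, since the body always throws

main :: IO ()
main = releasedAfterFailure >>= print
```

Running this prints True: the release step runs even though the body threw an exception.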
Patterns
turtle supports Patterns, which are like improved regular expressions. Use Patterns as lightweight parsers to extract typed values from unstructured text:
$ ghci
>>> :set -XOverloadedStrings
>>> import Turtle
>>> data Pet = Cat | Dog deriving (Show)
>>> let pet = ("cat" *> return Cat) <|> ("dog" *> return Dog) :: Pattern Pet
>>> match pet "dog"
[Dog]
>>> match (pet `sepBy` ",") "cat,dog,cat"
[[Cat,Dog,Cat]]
You can also use Patterns as arguments to commands like sed, grep, and find, and they do the right thing:
>>> stdout (grep (prefix "c") "cat") -- grep '^c'
cat
>>> stdout (grep (has ("c" <|> "d")) "dog") -- grep 'c\|d'
dog
>>> stdout (sed (digit *> return "!") "ABC123") -- sed 's/[[:digit:]]/!/g'
ABC!!!
Unlike many Haskell parsers, Patterns are fully backtracking, no exceptions.
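Here is a small sketch of both ideas, extracting a typed value and relying on backtracking. The "key=value" format and the name setting are hypothetical and chosen only for illustration; plus, letter, decimal and match come from turtle's Pattern utilities.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Turtle

-- | Parse a hypothetical "key=value" setting whose value is a number.
setting :: Pattern (Text, Int)
setting = do
    key   <- plus letter   -- one or more letters
    _     <- "="           -- a literal separator
    value <- decimal       -- a typed number, not a string
    return (key, value)

main :: IO ()
main = do
    -- The whole input must match, and the result is a typed pair
    print (match setting "width=80")

    -- `plus letter` can first consume all of "cats", but the Pattern
    -- backtracks to "cat" so that the trailing "s" can still match
    print (match (plus letter <* "s") "cats")
```

A parser that commits to the longest run of letters without backtracking would reject the second input, since nothing would be left for the final "s".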
Formatting
turtle supports typed printf-style string formatting:
>>> format ("I take "%d%" "%s%" arguments") 2 "typed"
"I take 2 typed arguments"
turtle even infers the number and types of arguments from the format string:
>>> :type format ("I take "%d%" "%s%" arguments")
format ("I take "%d%" "%s%" arguments") :: Int -> Text -> Text
This uses a simplified version of the Format type from the formatting library. Credit to Chris Done for the great idea.
The reason I didn't reuse the formatting library was that I spent a lot of effort keeping the types as simple as possible to improve error messages and inferred types.
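For readers curious how a format string can determine the number and types of arguments, here is a simplified model of the idea using only base. It is a sketch and not turtle's actual definitions: it uses String instead of Text, and since plain strings cannot serve as format fragments here, the hypothetical helper lit wraps literal text.

```haskell
-- | A format that, given a continuation for the finished string,
-- produces a function expecting the remaining arguments.
newtype Format a b = Format ((String -> a) -> b)

-- | Concatenate two formats; the argument types accumulate in order.
(%) :: Format b c -> Format a b -> Format a c
Format f % Format g = Format (\k -> f (\s1 -> g (\s2 -> k (s1 ++ s2))))

-- | A literal fragment that expects no arguments (hypothetical helper).
lit :: String -> Format r r
lit str = Format (\k -> k str)

-- | Expect an Int argument.
d :: Format r (Int -> r)
d = Format (\k n -> k (show n))

-- | Expect a String argument.
s :: Format r (String -> r)
s = Format (\k str -> k str)

-- | Run a format, yielding a function of the arguments it expects.
format :: Format String r -> r
format (Format f) = f id

main :: IO ()
main = putStrLn (format (lit "I take " % d % lit " " % s % lit " arguments") 2 "typed")
```

In this model the format expression has type Format String (Int -> String -> String), so passing the wrong number or type of arguments is a compile-time error rather than a runtime one.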
Learn more
turtle doesn't try to ambitiously reinvent shell scripting. Instead, turtle just strives to be a "better Bash". Embedding shell scripts in Haskell gives you the benefits of easy refactoring and basic sanity checking for your scripts.
You can find the turtle library on Hackage or GitHub. Also, turtle provides an extensive beginner-friendly tutorial targeted at people who don't know any Haskell at all.