Saturday, October 12, 2013

Haskell type graphs with Uniplate and Haskell-src-exts

Summary: Generating graphs from Haskell modules is easy, using Haskell-src-exts and Uniplate.

In my recent Uniplate talk I showed a graph of which Haskell-src-exts types contain Exp at any level. Several people in the audience asked for the code, so I've tidied it up and will walk through it in this post. Before we start, the graph we want to generate is:


Each node is a type, and an edge from a to b means at least one constructor of type a contains a value of type b. I only include types which eventually reach Exp. To calculate this graph, I used Haskell-src-exts and Uniplate. The main function is:

import Language.Haskell.Exts          -- haskell-src-exts
import Data.Generics.Uniplate.Data    -- uniplate
import Data.List

main :: IO ()
main = writeFile "graph.dot" . graph . interesting "Exp" . reach =<< getModule

We can read main from right to left:

  • We first call getModule to get the module we are interested in.
  • Next we call reach to compute which types are directly contained within which types.
  • We use interesting to pick out only those types which can reach Exp.
  • We convert to a DOT file using graph.
  • And finally we write the file out to Graph.dot.
  • Afterwards we can manually run dot -Tpng graph.dot > graph.png to generate the image above.

Now let's go through each of the functions in turn.

The getModule function

getModule :: IO Module
getModule = do
    src <- readFile "../../haskell-src-exts/src/Language/Haskell/Exts/Syntax.hs"
    return $ fromParseResult $ parseModule $ unlines $ filter (not . bad) $ lines src
    where bad x = head (words x ++ [""]) `elem` words "#if #else #endif deriving #ifdef"

We read the module from a known location on disk and parse it. The only complication is that the Haskell-src-exts AST module uses the C pre processor (CPP). By filtering out all lines beginning with CPP directives, and also deriving, we end up with a module with the same type structure, but which can be parsed with Haskell-src-exts. An alternative approach would be to use something like cpphs to properly interpret the CPP directives.

The reach function

reach :: Module -> [(String, [String])]
reach m = [ (prettyPrint name, nub [prettyPrint x | TyCon x <- universeBi ctors])
          | DataDecl _ _ _ name _ ctors _ <- universeBi m]

This function gets a list of type name and contained types for all data declarations in the module. We use the universeBi function from Uniplate to find everything of the relevant type, a list comprehension pattern to filter it down and grab the parts we want, and then prettyPrint to convert things to String. This function is the only one that uses Uniplate.

The interesting function

interesting :: String -> [(String, [String])] -> [(String, [String])]
interesting target xs = [(a,b `intersect` keep) | (a,b) <- xs, a `elem` keep]
    where keep = f [target] xs
          f want xs = if null new then want else f (map fst new ++ want) rest
              where (new,rest) = partition (not . null . intersect want . snd) xs

This function removes all types which don't at some point include the target, which is Exp in our case. We first compute keep, which is the interesting types, then restrict the input to only refer to things in keep. To compute keep we start by using the target as the thing we want. We then repeatedly find any types that include anything we want. Once there is nothing new that we want, we finish. This function is the only one that performs thoughtful computation, the others are just shuffling data between formats.

The graph function

graph :: [(String, [String])] -> String
graph xs = unlines $ ["digraph g {"] ++ [from ++ " -> " ++ t ++ ";" | (from,to) <- xs, t <- to] ++ ["}"]

This function converts the types into GraphViz format, which is a list of edges in a directed graph.

License

This code is throwaway code I wrote to generate one slide, and didn't even bother checking into a version control system. If it's useful to you, please use it for whatever purpose you want. Ignoring blank lines, optional type signatures and imports, it only takes 12 lines.

Wednesday, October 02, 2013

Upcoming talks

I'll be talking at two events this month - hope to see some people there!

9 Oct 2013 - Haskell eXchange 2013

Everyone should use a Generics library - writing HLint with Uniplate

A generics library allows programmers to express only the interesting part of certain tasks, avoiding lots of boilerplate and making the code significantly shorter and more maintainable. Haskell is awash with excellent generics libraries, including Scrap Your Boilerplate, Generics for the Masses, Generic Deriving and Uniplate, yet many developers will never have used any of them. This talk explains how to use the Uniplate generics library, using examples derived from the Haskell Lint tool (HLint), which makes extensive use of Uniplate.

23 Oct 2013 - Runcifest

Colin's Industrial Influence (joint with Malcolm Wallace)

Colin has dabbled in many areas of functional programming, from XML processing to data visualisation, from type searching to generic transformations. While most of Colin's contributions have been made from inside a University, their impact has been felt in industry. In this talk we take a whirlwind tour through some of the projects that have influenced us and our work.