Programming in Clojure. Part 2: Functional Programming
What Will I Learn?
In this part of the tutorial you will learn:
- What the benefits of functional programming are.
- How Clojure incorporates the functional programming paradigm into its design.
- How to build solutions for complex problems by combining simple functions like
To follow this part of the tutorial it might be beneficial to have:
- Experience in any other programming language, especially one designed for functional programming, ideally a Lisp dialect.
- Experience with functional programming
- Good mood and desire to learn
This part of the tutorial will teach you the core principles of functional programming and how to apply them when writing Clojure applications. We will develop a short program that models a very basic spam filter.
- Part 1: Why Clojure?
- Part 2: Functional Programming
- Part 3: Syntax and REPL
- Part 4: Data structures
- Part 5: Advanced data structures
- And maybe more...
Object Oriented vs. Functional Programming
Most modern programming languages are designed to be used with Object Oriented Programming (OOP) as a programming paradigm. OOP encourages you to divide your code into loosely coupled objects: bundles of data and the functions that operate on it. Objects are mutable by design. When an object's method is called, it may change the state of the object, so two equal objects might become different over time. Writing an OOP program essentially consists of describing a sequence of object transformations.
OOP is the most widely used programming paradigm because of its power and simplicity. OOP provides many options for defining convenient abstractions and decoupling code into smaller chunks, and it is very flexible overall. Yet it is not perfect, since it also has inherent weaknesses, such as:
- Mutable objects are very hard to track. It often happens that a bug in the code creates an incoherent internal state, which leads to an error much later in execution. Because the error is delayed, such bugs can be very tricky to find and fix.
- Mutability does not work well with concurrency and parallelism. The state of each shared object needs to be synchronized between all threads that want to modify it.
- The idea that everything can be modelled as an object can be seen as somewhat naive. For example, how would one effectively translate the concept of time into an object? Rich Hickey, the creator of the Clojure language, talks at length about issues like this in his 2009 JVM Languages Summit keynote Are We There Yet?.
Functional Programming takes a different approach. The key attributes of Functional Programming are the following:
- Variables are immutable. Once a value is assigned to a variable, it cannot be changed. If a value must be modified in some way, a new value is created instead. This is especially handy for applications that use parallelism, because immutable values can be shared between threads without synchronization.
- Functions are supposed to be pure: they should not cause side effects (such as mutations) and should not rely on any mutable state to calculate their return value. As a direct result, a pure function always returns exactly the same result given the same arguments. This property matches the definition of a function in mathematics: a function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.
- Data and functions are treated equally. Variables can be assigned functions as values, and functions can take other functions as arguments as well as return functions.
- Programs written in a functional style are declarative rather than imperative. Instead of describing the sequence of actions to take to reach a goal, the programmer describes conditions and relations between different abstractions, much like a mathematician describes relations between different entities to prove a theorem.
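The attributes listed above can be illustrated with a few lines of Clojure. This is only a minimal sketch; the names original, extended, square, apply-twice and fourth-power are made up for the illustration and are not part of the spam filter project.

```clojure
;; Immutability: conj returns a new vector, the original is left untouched.
(def original [1 2 3])
(def extended (conj original 4))

;; Purity: the result depends only on the argument, and nothing else changes.
(defn square [x] (* x x))

;; Functions as values: apply-twice takes a function and returns a new function.
(defn apply-twice [f]
  (fn [x] (f (f x))))

(def fourth-power (apply-twice square))

(prn original)                          ; prints [1 2 3]
(prn extended)                          ; prints [1 2 3 4]
(prn (fourth-power 3))                  ; prints 81

;; Declarative style: describe the result (the sum of the squares)
;; instead of writing a loop that updates a counter.
(prn (reduce + (map square original)))  ; prints 14
```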
It is important to note that the above points are purely theoretical. In practice, every functional programming language must handle the problem of working with stateful entities one way or another, because the real world is a stateful environment. For example, OS channels like stdin, stdout and stderr are inherently mutable, and every useful program needs some way to return an output. However, it is possible to separate the parts of the code that work with necessarily mutable environments from the parts that can be written in a pure functional style. If, for example, your task is to write a complicated application that takes some input data, calculates some statistics on it and stores them in a database, it is a good idea to write the statistics computation as purely functional code and then plug it into a non-pure function that handles reading the input data and saving the output to the database.
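A rough sketch of that separation might look like the following. The function names summarize and report! are hypothetical, and println stands in for the database write, which is outside the scope of this tutorial.

```clojure
;; Pure core: the statistics depend only on the argument.
(defn summarize [numbers]
  {:count (count numbers)
   :total (reduce + numbers)})

;; Impure shell: the only place that touches the outside world.
;; println is a stand-in for saving the result to a database.
(defn report! [numbers]
  (println (summarize numbers)))

(report! [3 4 5])  ; prints {:count 3, :total 12}
```

Because summarize is pure, it can be tested without any database or I/O setup; only the thin report! wrapper needs the real environment.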
To apply the principles of functional programming in practice, let's imagine you are working on a simple spam filter for your emails. This spam filter will read through your emails, remove those that contain the word "ICO", and print the rest to standard output, one message per line.
The name of the project is laxam-clojure-tutorial-part2-spam-filter. As we did in part one of this tutorial, create a new project with the Leiningen command lein new app laxam-clojure-tutorial-part2-spam-filter and (optionally) update
For the purposes of this tutorial, incoming messages are provided as a Clojure list. In a real application you would probably fetch this list from a database or receive it through an API.
(def messages
  '("Hi, it's me, Ronald. Let's go bowling?"
    "@richhickey approved your Pull Request."
    "New promising startup works on a perpetual motion machine. Participate in ICO now!"
    "Hi, it's Greg, are you up for a beer tonight?"))
def binds the name messages to the value of its second argument, which in our case is a list of messages. We can define a list of items in Clojure by prepending the opening parenthesis with a quote (') symbol. If we put this code at the top level of our namespace, we can refer to it from any function in that namespace.
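To see what the quote does, it can help to compare a quoted list with one built by the list function. The names quoted and built below are hypothetical and only used for this comparison.

```clojure
;; Quoting stops Clojure from treating the first item as a function to call.
(def quoted '("first" "second"))

;; The list function builds an equal list without quoting.
(def built (list "first" "second"))

(prn (= quoted built))  ; prints true
(prn (first quoted))    ; prints "first"
(prn (count quoted))    ; prints 2
```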
We will also bind our spam word to a corresponding name, so that it's easier to change it to something else in the future if needed:
(def spam-word "ICO")
Next, we need to write a function that determines whether a particular message is considered spam. To do this, we will first split the message into a list of words and then check whether one of the words is ICO. Before checking for the presence of the spam word we will convert the list of words into a set, because it's easier to perform a presence check on a set.
To split a string into words, accounting for special characters, consecutive whitespace and punctuation characters, we will use the split function from the clojure.string namespace. The split function takes a target string as its first argument and a regular expression as its second. Regular expressions in Clojure look like strings with a # symbol prepended. To use the clojure.string namespace, we change the namespace declaration at the top of the file to look like this:
(ns laxam-clojure-tutorial-part2-spam-filter.core
  (:require [clojure.string :as str])
  (:gen-class))
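As a quick experiment, for example in a REPL, split can be tried on a couple of strings. The sketch below uses require instead of the ns form so that it can be pasted anywhere; the sample strings are made up.

```clojure
(require '[clojure.string :as str])

;; Runs of punctuation and whitespace count as a single separator.
(prn (str/split "one,  two...three" #"\W+"))  ; prints ["one" "two" "three"]

;; The apostrophe is not a word character either, so "Let's" is split in two.
(prn (str/split "Let's go bowling?" #"\W+"))  ; prints ["Let" "s" "go" "bowling"]
```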
Now we are ready to write the function out:
(defn spam?
  "Takes one message as an argument and returns true if it is spam, otherwise false"
  [message]
  (contains? (set (str/split message #"\W+")) spam-word))
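The #"\W+" bit is a regular expression that matches one or more non-word characters (anything other than letters, digits and underscores), so (str/split message #"\W+") will split the message into words. Read from the inside out, the function splits the message into words, converts the words into a set, and checks whether that set contains the spam word.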
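The individual steps inside spam? can be inspected one at a time. The sketch below repeats the spam-word definition so that it runs on its own; message, words and word-set are hypothetical helper names.

```clojure
(require '[clojure.string :as str])

(def spam-word "ICO")
(def message "Participate in ICO now!")

;; Step 1: split the message into a vector of words.
(def words (str/split message #"\W+"))
(prn words)                            ; prints ["Participate" "in" "ICO" "now"]

;; Step 2: convert the words into a set.
(def word-set (set words))

;; Step 3: check whether the set contains the spam word.
(prn (contains? word-set spam-word))   ; prints true
(prn (contains? word-set "ico"))       ; prints false, the check is case-sensitive

;; Without the set, contains? would look at the vector's indices
;; rather than its values, which is not what we want here.
(prn (contains? words spam-word))      ; prints false
```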
Now we just need to apply this function to the list of messages and remove those for which it returns true. Clojure has exactly the function we need: remove. The first argument of remove is a function which will be applied to each item in the collection passed as the second argument. If this function returns true, the element will be omitted from the returned list. Keep in mind that, despite its name, remove does not mutate the original list; it constructs a completely new list instead. We can get the list of spam-free messages like this:
(remove spam? messages)
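The non-mutating behaviour of remove is easy to check on a small list of numbers. This is only an illustration with made-up names (numbers, odds) and the built-in even? predicate.

```clojure
(def numbers '(1 2 3 4 5))

;; remove returns a new sequence without the items for which even? is true.
(def odds (remove even? numbers))

(prn odds)                    ; prints (1 3 5)
(prn numbers)                 ; prints (1 2 3 4 5), the original is unchanged

;; filter is the mirror image: it keeps the items for which the predicate is true.
(prn (filter even? numbers))  ; prints (2 4)
```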
What's left is just printing them out. println can help us with that, but since we need to print one message per line, we need to call println on each message individually. The map function can do just that:
(map println (remove spam? messages))
map will apply println to each element of (remove spam? messages) and return a list consisting of all the values returned by println. map is often used to transform a collection in one way or another. In our example, however, that is not the case. println is a non-pure function: we are calling it not because we are interested in its return value (which will always be nil), but because we want to trigger a side effect, namely printing a string to standard output.
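The difference between the two uses can be seen in a small sketch. The sample strings are made up; count and prn are built-in functions.

```clojure
;; Typical use of map: build a new collection of transformed values.
(prn (map count ["one" "three"]))  ; prints (3 5)

;; println is called for its side effect; its return value is always nil.
(prn (println "hello"))            ; prints hello, then nil on the next line
```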
We are almost finished, but the program will not print anything just yet. Like many other functional languages, Clojure makes use of lazy evaluation: map and remove return lazy sequences, whose elements are computed only when something consumes them. This works well for code consisting of pure functions, but in our case it prevents println from being called, because we never use the list of nils that map returns. To force the evaluation of map we can use the built-in doall function.
(doall (map println (remove spam? messages)))
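Laziness can be made visible by counting how often a function is actually called. The sketch below uses an atom, one of Clojure's tools for controlled mutable state, purely as a call counter; calls, noisy and lazy-result are hypothetical names.

```clojure
;; A counter that records how many times noisy has been called.
(def calls (atom 0))

(defn noisy [x]
  (swap! calls inc)
  x)

;; Creating the lazy sequence does not call noisy yet.
(def lazy-result (map noisy [1 2 3]))
(prn @calls)  ; prints 0

;; doall walks the whole sequence and forces every element.
(doall lazy-result)
(prn @calls)  ; prints 3
```

When the returned values are not needed at all, as with println, doseq and run! are common alternatives to wrapping map in doall.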
Now if we put this code in -main, we can execute our program with lein run to ensure it actually works. The source code of the finished application can be found on GitHub.
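Assembled from the pieces shown so far, the whole program could look roughly like this. It is a sketch rather than the exact file from the repository: it assumes the code lives in the core.clj file generated by lein new app, and the -main docstring and the [& args] parameter list are assumptions.

```clojure
(ns laxam-clojure-tutorial-part2-spam-filter.core
  (:require [clojure.string :as str])
  (:gen-class))

(def messages
  '("Hi, it's me, Ronald. Let's go bowling?"
    "@richhickey approved your Pull Request."
    "New promising startup works on a perpetual motion machine. Participate in ICO now!"
    "Hi, it's Greg, are you up for a beer tonight?"))

(def spam-word "ICO")

(defn spam?
  "Takes one message as an argument and returns true if it is spam, otherwise false"
  [message]
  (contains? (set (str/split message #"\W+")) spam-word))

(defn -main
  "Prints every message that is not spam, one message per line."
  [& args]
  ;; doall forces the lazy sequence so that println is actually called.
  (doall (map println (remove spam? messages))))
```

With this in place, lein run should print the three messages that do not contain the word ICO.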
While the application we wrote is small, it demonstrates how experienced developers design functional code. Clojure's Lisp-like syntax allows a developer to be very concise in describing transformations on lists. In accordance with functional programming principles, spam? is designed to be a pure function, while the inherently non-pure input and output operations are handled in -main.
I hope this tutorial was enjoyable, as well as educational.
Posted on Utopian.io - Rewarding Open Source Contributors