Programming in Clojure. Part 4: Reader


What Will I Learn?

In this part of the tutorial you will learn:

  • What the Clojure reader is and how it preprocesses Clojure code.
  • How Clojure's primitive data types differ from those in other languages.
  • How to use the syntactic sugar provided by the reader to make Clojure code more concise and expressive.

Requirements

To follow this part of the tutorial, it helps to have:

  • Experience with other programming languages, especially ones designed for functional programming, ideally Lisp dialects.
  • Experience with functional programming.
  • A good mood and a desire to learn.

Difficulty

  • Intermediate

Tutorial Contents

In the previous part of the tutorial we discussed Clojure's syntax in terms of simple mathematical expressions. This part expands on that topic by covering the evaluation rules for Clojure's primitive data types, source code preprocessing, symbol value lookup, and a couple of particularly useful syntax features.

Reader

Evaluating any Clojure program relies on the function read, which we used in Part 3: Syntax and REPL to implement a Clojure REPL. As you might remember, it takes a character stream (source code) as input and returns the corresponding data structure (an abstract syntax tree). Clojure has a very similar function, read-string, which works almost the same way except that it takes a string instead of a stream. The component responsible for turning Clojure code into its internal data structure representation is called the Reader.

Since Clojure source code mirrors that data structure so closely, it can be hard to see the difference between source code and the corresponding internal data structure in trivial examples:

user=> (read-string "(+ 1 2)")
(+ 1 2)

Yet it is not the same thing:

user=> (read)
(+ 1
   2
   ;3
   4)
(+ 1 2 4)

As you can see, formatting and comments (which in Clojure start with the ; character) are not part of the internal program representation.
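To make the distinction concrete, here is a minimal sketch, assuming a standard Clojure REPL, that treats the result of read-string as ordinary data. The name form is only illustrative and does not come from the tutorial.

```clojure
;; Read the same source text as in the example above, comment included.
(def form (read-string "(+ 1\n   2\n   ;3\n   4)"))

(list? form)            ; true: the reader produced a list, not a string
(symbol? (first form))  ; true: its first element is the symbol +
(count form)            ; 4: the commented-out 3 is gone
(eval form)             ; 7: evaluating the list runs the addition
```

Because the result is a plain list, it can be inspected or transformed with the usual collection functions before it is ever evaluated.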

It is important to understand that the internal representation of the code (the abstract syntax tree) is not stored as a string. When Clojure needs to print a data structure to the command line, it has to serialize it back into a string. For this, Clojure uses the functions pr and pr-str, which mirror read and read-string in the opposite direction: pr takes data structures as arguments, prints them to standard output and returns nil, while pr-str returns the output as a string instead.

user=> (pr-str ["hello" (+ 1 1)])
"[\"hello\" 2]"

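Since pr-str and read-string work in opposite directions, data can make a round trip through text. The following is a small sketch of that idea; the names data and text are illustrative assumptions.

```clojure
;; A data structure built from the literal types discussed in this tutorial.
(def data {:name "The Matrix" :year 1999 :tags ["sci-fi" "action"]})

;; Serialize it to a string with pr-str ...
(def text (pr-str data))

(string? text)               ; true: text is just characters now
;; ... and read it back into an equal data structure.
(= data (read-string text))  ; true
```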
Reading Data types

Evaluation rules depend on the type of data that read returns. Most literals evaluate to themselves, as you would expect:

  • Strings: "hello world"
  • Booleans: true and false
  • nil: the same as null in Java
  • Characters: \a or \u20ac (Unicode escape)
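The self-evaluating behaviour of these literals can be checked directly. This is a minimal sketch, assuming a standard Clojure REPL; it uses eval only to make the evaluation step explicit.

```clojure
;; Each of these literals evaluates to itself.
(= "hello world" (eval "hello world"))  ; true
(= true (eval true))                    ; true
(nil? (eval nil))                       ; true

;; Character literals start with a backslash.
(class \a)    ; java.lang.Character
(int \u20ac)  ; 8364, the code point of the euro sign
```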

Numbers

With numbers, things are a bit different from Java and other languages. Clojure provides literal syntax for ratios, arbitrary-precision integers, arbitrary-precision decimal numbers, and integers written in a custom radix:

  • The most commonly used integer type is long (64 bits). A long can be written in the following ways:
    • 10 - ordinary decimal notation
    • 0xff - hexadecimal
    • 2r11 - custom radix (base). Here "11" is read as a base-two number, so it represents the integer 3.
    • 070 - octal (base 8), which represents the integer 56
  • The most commonly used floating-point type is double (64 bits):
    • 23.54
    • 4.9132205e18 - scientific notation
  • Arbitrary-precision integers use the clojure.lang.BigInt type: 3519N
  • Arbitrary-precision decimal numbers use java.math.BigDecimal: 5.1052M
  • Ratios are of type clojure.lang.Ratio: 22/7
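The types listed above can be confirmed with class. This is a short sketch, assuming Clojure on the JVM.

```clojure
;; The type behind each numeric literal.
(class 10)       ; java.lang.Long
(class 23.54)    ; java.lang.Double
(class 3519N)    ; clojure.lang.BigInt
(class 5.1052M)  ; java.math.BigDecimal
(class 22/7)     ; clojure.lang.Ratio

;; Different notations, same long value.
(= 255 0xff)     ; true
(= 3 2r11)       ; true
(= 56 070)       ; true
```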

The abstract syntax tree does not record which notation was used to write a number. Once read, the number is stored in its binary form, so when pr-str turns it back into a string, the number is displayed in its default representation:

user=> (pr-str [0xff 2r11 010 1e1 5/10])
"[255 3 8 10.0 1/2]"

Note that the last element of the vector has "changed" from 5/10 to 1/2. That is because 5/10 and 1/2 are different representations of the same number. The default representation of a ratio is its simplest form, so pr-str prints 1/2.
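A few expressions illustrate how ratios behave once they have been read. This is a minimal sketch, assuming a standard Clojure REPL.

```clojure
(= 5/10 1/2)     ; true: both literals read as the same ratio
(/ 5 10)         ; 1/2: dividing integers keeps the exact ratio
(+ 1/2 1/3)      ; 5/6
(double 1/2)     ; 0.5: convert explicitly when a double is needed
(class (/ 4 2))  ; java.lang.Long: a quotient that is a whole number is not a Ratio
```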

Keywords

Keywords are not commonly found in object-oriented languages. Keywords in Clojure start with the : character and evaluate to themselves.

Clojure maps can use any value as a key, but keywords are the most convenient choice for this role, since a keyword can be used as a function that fetches the corresponding value from the map:

user=> (def movie {"name" "The Matrix" "year" 1999})
#'user/movie
user=> (get movie "name")
"The Matrix"
user=> ("name" movie)

ClassCastException java.lang.String cannot be cast to clojure.lang.IFn  user$eval1826.invokeStatic (:1)
user=> (def movie {:name "The Matrix" :year 1999})
#'user/movie
user=> (:name movie)
"The Matrix"

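Keyword lookup has a few more conveniences worth knowing. The sketch below reuses the movie map from the example above; the :director key and the movies vector are illustrative assumptions.

```clojure
(def movie {:name "The Matrix" :year 1999})

(:name movie)                ; "The Matrix"
(:director movie "unknown")  ; "unknown": optional default for a missing key
(movie :year)                ; 1999: a map is itself a function of its keys

;; Because a keyword is a function, it can be passed to map directly.
(def movies [{:name "The Matrix"} {:name "Alien"}])
(map :name movies)           ; ("The Matrix" "Alien")
```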
Symbols

Symbols are "boxes" that hold values, and they evaluate to those values. Consider the following example:

(def x 3)
(defn add2 [y] (+ y 2))
(add2 x)

  • x is a symbol. When (def x 3) is evaluated, this symbol gets the value 3.
  • add2 is a symbol. Once the defn form is evaluated, add2 has a function as its value.
  • y is a symbol. It gets its value whenever the function is called.
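The lookups described above can be observed in a REPL. This is a minimal sketch, assuming the definitions from the example; the let form is added only to show that a local binding takes precedence over the global one.

```clojure
(def x 3)
(defn add2 [y] (+ y 2))

(symbol? (read-string "x"))  ; true: the reader produces a symbol
x                            ; 3: evaluating the symbol looks up its value
(fn? add2)                   ; true: add2 evaluates to a function
(add2 x)                     ; 5
(let [x 10] (add2 x))        ; 12: inside let, x refers to the local value
```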

Reader transformations

The reader does much more preprocessing than it might seem so far. One example of syntactic sugar provided by the reader is the concise anonymous function notation. The usual way to define an anonymous function looks like this:

(fn [x] (+ x 1))

It is very similar to a named function, except that it uses fn instead of defn and does not need a symbol for a name. Another way to define the same function is #(+ % 1), which is a bit cryptic but much more concise. The #() notation tells the reader to expand the code into an anonymous function, and the % symbol is a placeholder for the argument. If we feed this code to read-string, we can see how the reader transformed it:

user=> (read-string "#(+ % 1)")
(fn* [p1__1900#] (+ p1__1900# 1))

fn* is a special form used internally by the Clojure compiler; it defines a function in much the same way fn does. % was expanded into the generated symbol p1__1900# (the number varies between runs). In practice, this function is equivalent to (fn [x] (+ x 1)).
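The shorthand also supports several arguments. The following sketch, assuming a standard Clojure REPL, shows the numbered placeholders %1 and %2 and the rest placeholder %&.

```clojure
(map #(+ % 1) [1 2 3])   ; (2 3 4)
(#(* %1 %2) 3 4)         ; 12: %1 and %2 are the first and second arguments
(#(apply + %&) 1 2 3)    ; 6: %& collects all remaining arguments

;; The reader expands the shorthand into a list that starts with fn*.
(first (read-string "#(* %1 %2)"))  ; fn*
```

The shorthand is best kept for short, single-expression functions; for anything longer, fn with named parameters tends to be easier to read.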

Another simple trick the reader performs for us is transforming forms like 'x into (quote x). The quote special form takes a form and returns it without evaluating it:

user=> (def x 101)
#'user/x
user=> x
101
user=> (quote x)
x
user=> (+ 1 3)
4
user=> (quote (+ 1 3))
(+ 1 3)
user=> (read-string "'(+ 1 3)")
(quote (+ 1 3))
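Quoting is what makes it possible to handle code as data. This is a minimal sketch, assuming a standard Clojure REPL.

```clojure
;; 'x and (quote x) read as exactly the same form.
(= (read-string "'x") (read-string "(quote x)"))  ; true

;; A quoted list is plain data that can be taken apart ...
(first '(+ 1 3))   ; the symbol +
(rest '(+ 1 3))    ; (1 3)

;; ... and evaluated later on demand.
(eval '(+ 1 3))    ; 4
```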

Comments

Single-line comments in Clojure start with the ; character and work like single-line comments in most programming languages. Clojure also supports a form-level commenting mechanism: the #_ reader macro. It works as follows: any form prefixed with #_ is ignored by the reader:

user=> (read-string "(+ 2 3 #_(* 10 10 \"Hello World\") 4)")
(+ 2 3 4)

#_ can be useful where you would use multiline comments in other languages, but it is not identical. The reader still has to parse the commented-out form to find where it ends, so syntax errors inside it still cause an error at read time:

user=> (read-string "(+ 2 3 #_( ] ) 4)")

RuntimeException Unmatched delimiter: ]  clojure.lang.Util.runtimeException (Util.java:221)
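The behaviour of #_ can be explored safely by catching the reader error. In the sketch below, readable? is a hypothetical helper written for this illustration and is not part of Clojure.

```clojure
(read-string "[1 #_2 3]")               ; [1 3]
(read-string "[1 #_(ignored form) 3]")  ; [1 3]

;; Hypothetical helper: returns true if the string can be read, false if the reader throws.
(defn readable? [s]
  (try
    (read-string s)
    true
    (catch RuntimeException _
      false)))

(readable? "(+ 2 3 #_(* 10 10) 4)")  ; true
(readable? "(+ 2 3 #_( ] ) 4)")      ; false: the discarded form must still parse
```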

Another way to stop a form from being evaluated is the comment macro. comment ignores its arguments and returns nil, so you must be careful about where you use it:

user=> (do (println "Using comment inside the do-form is perfectly fine") (comment (* 10 10)))
Using comment inside the do-form is perfectly fine
nil
user=> (+ 1 2 (comment (println "But it will cause problem if return value is used by mistake")))

NullPointerException   clojure.lang.Numbers.ops (Numbers.java:1013)
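The difference between the two mechanisms is easiest to see inside a collection literal. This is a minimal sketch, assuming a standard Clojure REPL.

```clojure
;; comment is evaluated like any other form and yields nil ...
(nil? (comment (println "never printed")))  ; true
[1 (comment 2) 3]                           ; [1 nil 3]

;; ... while #_ removes the form before evaluation ever happens.
[1 #_2 3]                                   ; [1 3]
```

When a form must vanish without leaving a nil behind, #_ is usually the safer choice.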

Summary

Clojure provides many tools that can make your code more beautiful and readable. A professional developer is expected to understand, to some extent, how those tools work internally. Clojure's learning curve may be steep at times, but with a solid understanding of how the reader and compiler work, a programmer can be remarkably efficient.




Heads up to the moderators:

I'm aware of this rule:

Submissions containing substantial instruction in ubiquitous functions (Save, Open, Print, etc.) or basic programming concepts (variables, operators, loops, etc.) will be rejected.

Please take into account that while this tutorial describes concepts that concern variables and primitive data types, it describes the particular way Clojure treats these concepts and not the concepts themselves. The information in the tutorial is definitely not trivial and not self-evident for people unfamiliar with Lisp programming. Thanks.

