I ran across some beautiful code today.
I have the book Beautiful Code, though I've only read the chapter by Douglas Crockford on writing a simplified JavaScript parser using Top Down Operator Precedence (say that three times fast). And I skimmed the MapReduce chapter once. It's on my TODO list. Along with a lot of other things. So much awesome CS to do, so little ...
So that's not the beautiful code I'm referring to.
I spent the last few days going back to do some problems at 4Clojure. I finished about half of them a number of months ago and now it's time to make some progress on them again.
So I picked an "easy" one (so they claim). And got stumped. It was Problem 28: Flatten a Sequence. In the end, after only coming up with solutions involving mutable state (for shame), I decided to learn from the masters and look at the source code to clojure.core/flatten.
/* ---[ Beautiful Clojure, version 1 ]--- */
If you haven't seen it before, it's quite interesting:
It treats a nested collection, such as [1 2 [3 [4 5] 6]]
, like a tree, where the collection and its subcollections are branches (non-leaf nodes) and the values are the leaf-nodes.
It depends on tree-seq
to do most of the work. I won't reproduce it here but take a look to understand it better.
After grokking the idea of treating the data structure like a tree, my next question was "why does flatten have to filter the results?". Here's why:
test=> (def ds [1 2 [3 4 [5 "a"]]])
#'test/ds
test=> (tree-seq sequential? seq ds)
([1 2 [3 4 [5 "a"]]] 1 2 [3 4 [5 "a"]] 3 4 [5 "a"] 5 "a")
I see, tree-seq
returns both branches (with their children) and leaf-nodes. To get just the leaf-nodes, I have to filter out the branches.
That is a fascinating idea, but I have to admit it wasn't what I expected the code to flatten
to look like.
Then I solved the problem on 4clojure and was eager to see how others had solved it. I found two right away that immediately struck me as beautiful code.
/* ---[ Beautiful Clojure, version 2 ]--- */
The first, from nikkomega, uses mapcat
, which works like map
but calls (apply concat)
on the result. I am in awe of its elegance:
Use map
to iterate over each element in the data structure. If it is a "value", like 3, then return '(3)
. If it is a collection (more precisely implements the Sequential
Clojure interface), then recursively dive into that collection.
In the end the outer map returns a sequence: '('(1) '(2) '(3) '(4) '(5) '("a"))
(using the collection I listed above). Applying concat gives us the flattened sequence: '(1 2 3 4 5 "a")
.
Like clojure.core/flatten, this one also returns a LazySeq
, but we get that for free - both map and concat return lazy-seqs.
/* ---[ Beautiful Clojure, version 3 ]--- */
The second solution that caught my eye is from unlogic. In the end it is pretty much the same algorithm, but uses reduce to iterate over all the elements and concats the results as it goes:
/* ---[ Beautiful Groovy? ]--- */
So I wondered, could I do this in Groovy, a language that provides some nice functional programming features?
Here is my best shot:
It prints:
[1, 2, 3, 4, 5, a]
[1, 2, 3, 4, 5, a]
(Disclaimer: I'm still learning Groovy so if you see a better way, I'd be happy to hear it.)
inject
is (unfortunately) the name for reduce
in Groovy (and other languages, like Ruby). In Groovy, there is no mapcat
(that I know of) so I wrote it using inject
and +
, the latter being the equivalent of Clojure's concat
when applied to collections.
You'll notice that I am using mutable state here: the shovel operator, <<
, pushes onto and modifies in place the data structure I'm building up and the concat in flat3 has to be assigned back to x. I'd like to see a version that doesn't use immutable state, but perhaps even here it is justified? No less than Rich Hickey stated in his seminal "Are We There, Yet?" presentation that he's "not a fundamentalist, but rather a pragmatist": there are times when working with local variables in a contained setting that mutating state is a pragmatic and safe thing to do. Does that apply here?
Comments welcome.
No comments:
Post a Comment