Wednesday, December 26, 2012

My CS/Programming Top 10 for 2012

As many will do and I did last year, I looked through my notes, projects, tweets, blog entries and personal wiki to assemble the highlights of my year on all topics computer science and software engineering.

These are the top 10 "things" that added the most to my knowledge, impressed me as excellent tools, and added to the joy of being a software developer. Here's my list in no particular order:

  1. Security Now podcast
  2. Chrome Dev Tools
  3. Emacs 24
  4. Groovy
  5. Clojure
  6. z (rupa/z)
  7. Xubuntu
  8. Coursera
  9. Fiddler2 and Wireshark
  10. TrueCrypt and other encryption tools
  11. One that should be on the list: Datomic

/* ---[ 1. Security Now podcast ]--- */

In March I did a short blog entry on the podcasts I was listening to or had heard of and wanted to try out. Good technical podcasts are a gold mine of information that you can use to fill the interstices of your day - while commuting, cleaning the kitchen or taking a walk.

The podcast that had the biggest impact on me in 2012 is Security Now, done by Steve Gibson and Leo Laporte, one of the flagship podcasts of the TWiT network.

To date, they have done 384 episodes, starting in August 2005 and the vast majority are still relevant and worth listening to. You can download them from Steve's GRC Security Now website.

The focus, of course, is on computer security, mostly for the individual user, not at the corporate level. The podcast also covers networking theory and practice in great detail, since the network is how most malware spreads and is itself a vast attack surface. For example, this year Steve did a deep dive into SPDY, the networking protocol Google developed to speed up the web: it reduces page load time by working around the TCP slow start problem.

And there's plenty of focus on cryptography and security. A highlight of the year was Steve's episode on Elliptic Curve Crypto, a crypto technology that will likely be used more heavily in days to come.

In addition, you learn a lot about how hard drives work, since Steve wrote SpinRite, a disk maintenance and recovery utility, which I use for maintenance on my spinning disks.

Also, starting with episode #233 ("Let's Design a Computer, part 1"), Steve does an 8+ episode series on the basics of how computers work, including what machine language is, how assembly language works, the role of stacks and registers, hardware interrupts and RISC vs. CISC architectures. You can learn (or be refreshed on) this material in a surprising level of detail for an audio-only medium. Steve is very good at explaining this stuff.

This year, while keeping up with the weekly new broadcasts, I went back and started at episode 1. At this point I've listened to about half of the episodes, so this will continue to be my mainstay into 2013.

/* ---[ 2. Chrome Dev Tools ]--- */

This year I got back into JavaScript programming. I remember the horrible days of debugging by alert statements, which contributed to the general consensus that JavaScript was a toy language and a piece of ill-thought-out crap. Despite its warts, which are a result of the ridiculously short design period he was allowed, Brendan Eich created a rather fascinating and powerful programming language. Even though we like to complain about its issues, I agree with Crockford that given the conditions under which it was developed, we got better than we deserved and Mr. Eich is to thank for that.

So I was pleased to discover the awesome Chrome Dev Tools for browser based JavaScript development. JavaScript debugging can actually be pleasurable. Some resources to get you started if you aren't using it:

/* ---[ 3. Emacs 24 ]--- */

Emacs is alive and well. In fact it is thriving more than ever. I've been a long time user of emacs and I use it for everything except Java (which really needs a full IDE).

Emacs 24, released this year, is a great text editor. I use it on both Linux and Windows, the exact same set up on both.

Most notably Emacs has package repositories now. Three, in fact, that I know of:
* ELPA, which is maintained/sponsored by GNU and has only the core emacs packages that adhere to the copyleft licensing model of the Free Software Foundation.
* Marmalade
* MELPA, which is where most of the bleeding edge work goes.

I use ELPA and MELPA by default, but I sometimes switch over to Marmalade if it has something not on the others. Generally MELPA and Marmalade seem to have the same packages, though MELPA often has the most recent. To make things confusing, MELPA moved to a date-based versioning system, like "20121206.1504", rather than the more traditional major.minor versioning system, such as "0.19".

There is still a big learning curve to emacs and some things are still pretty esoteric (I still have trouble getting themes to work), but when people ask me why I use emacs I say: "if programming is your main career and hobby wouldn't you want to use the most powerful tool available? It's worth the few months of learning to enjoy the benefits for the rest of your life." But isn't emacs "old"??? (as if that's a bad thing) Seriously, when I use emacs I feel like I'm tapping into some of the collective wisdom of our programming culture from the last 30 years.

And no disrespect to vim. I like vim too. Pick one of those two and learn it. Stop using Notepad++ or worse.

A few emacs highlights from my year:

  • I love the nrepl package for Clojure. Now I can use those fancy keystrokes to auto-evaluate Clojure s-expressions. With the ac-nrepl package, it has code completion and will show you the argument signature for functions in the minibuffer! Some IDE-like goodness right there.

  • paredit. When I talk to people about Clojure (or Lisp in general), I sometimes get the story of how horrible it was balancing parentheses at 3 in the morning the day their CS class assignment was due. I am happy to announce to anyone who doesn't know: that problem is solved. Its name is paredit. Here is the slide deck I originally learned it from.

  • Learn to use emacs macros in two ways:

    • named macros you'll use a lot and save in your init.el (or macros.el if you want a separate file).
    • temp unnamed macros to automate a one-time repetitive task that you need to do, say, 10 times in a file. This EmacsRocks video shows a great example of that.

/* ---[ 4. Groovy ]--- */

When I was first learning Ruby, many years ago, I remember experiencing Matz' principle of least surprise. Once you learned the basic gist of Ruby and its blocks and classes, you could often just guess how to do something or what a method would be called and it would work. It was a very satisfying experience.

This year I joined a new company and they have largely adopted Groovy as their primary scripting language. As I jumped in to learn it, I had that deja vu feeling of learning Ruby, this time wrapping the Java language we know and love.

For example, I started to write a Groovy script that would have to recursively traverse a directory structure, and I remembered the pain of doing this in Java with its FilenameFilters and other APIs you had to learn to get anything done. I said to myself "I hope Groovy has made this easier". Holy smokes, they wrapped File to have an eachFileRecurse method that takes a closure:

new File('.').eachFileRecurse {
  // 'it' is the current File; print it if its name contains ".txt"
  if (it.name =~ /.*\.txt/) println it
}

There is also an eachDirRecurse and variations where you can pass in a file type filter.
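
Here is a rough sketch of those variations, assuming a reasonably recent Groovy. The temporary directory and the file names in it are made up purely so the example is self-contained; groovy.io.FileType is the filter type that eachFileRecurse accepts.

```groovy
import groovy.io.FileType

// Build a small throwaway tree (names are invented for illustration)
def root = File.createTempDir()
new File(root, 'docs').mkdirs()
new File(root, 'notes.txt').text = 'top level'
new File(root, 'docs/readme.txt').text = 'nested'
new File(root, 'docs/image.png').text = 'not text'

// Visit only regular files, skipping directories
def textFiles = []
root.eachFileRecurse(FileType.FILES) { file ->
    if (file.name.endsWith('.txt')) textFiles << file.name
}
textFiles.sort()

// Visit only directories
def dirs = []
root.eachDirRecurse { dir -> dirs << dir.name }

println textFiles
println dirs

// Clean up the throwaway tree
root.deleteDir()
```

Traversal order is not guaranteed, which is why the file names are sorted before printing.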

The more I learn about Groovy the more I like it. In fact, the "groovy-JDK" is one of my favorite things: The Groovy creators and contributors have wrapped a large number of the Java classes, using the Groovy metaclass concept, and given them additional useful methods. Such as:

  • String now has an eachLine method and versions of replaceAll and replaceFirst that take a closure, allowing arbitrarily complex logic to be executed to determine the replacement string.

  • Map now has an any method that takes a predicate closure to see if at least one entry passes the predicate test. It also now has map and reduce, though the authors unfortunately followed Ruby in calling them collect and inject respectively.

  • And thank the gods, they wrapped the horrible java.util.Date class and made it more useful.
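
A few of these additions, sketched with made-up sample data (the strings and the stock map are invented for illustration only):

```groovy
// String.eachLine: run a closure once per line of a multi-line string
def lines = []
'alpha\nbeta\ngamma'.eachLine { line -> lines << line.toUpperCase() }
assert lines == ['ALPHA', 'BETA', 'GAMMA']

// replaceAll with a closure: the closure computes each replacement,
// here doubling every run of digits
def doubled = 'a1b22c'.replaceAll(/\d+/) { match ->
    (match.toInteger() * 2).toString()
}
assert doubled == 'a2b44c'

// Map.any: true if at least one entry satisfies the predicate
def stock = [apples: 3, pears: 0, plums: 12]
def soldOut = stock.any { name, count -> count == 0 }
assert soldOut

// collect is "map" and inject is "reduce"
def labels = stock.collect { name, count -> "${name}=${count}".toString() }
def total = stock.inject(0) { sum, entry -> sum + entry.value }
assert labels == ['apples=3', 'pears=0', 'plums=12']
assert total == 15
```

Map literals in Groovy preserve insertion order, so the collected labels come back in the order the entries were written.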

It provides many functional programming constructs, such as closures (the lambdas of Groovy), immutable data structures, higher order functions and very importantly: regex, list and map literals, akin to JavaScript or Clojure literals (though the map literal syntax is different in Groovy).

With GStrings you get string interpolation and multi-line strings. And Groovy gives you simpler syntax for accessing getters and setters - you grab them like properties.
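
A small sketch of both features. The Person class is hypothetical and exists only for this illustration; the same property syntax should work against any Java bean with getters and setters.

```groovy
// Hypothetical class for illustration; Groovy generates getName/setName etc.
class Person {
    String name
    int age
}

def p = new Person(name: 'Ada', age: 36)

// Property syntax calls the setter and getter behind the scenes
p.age = 37
assert p.getAge() == 37

// GString interpolation with ${...}
def greeting = "Hello, ${p.name}! Next year you will be ${p.age + 1}."
assert greeting == 'Hello, Ada! Next year you will be 38.'

// Triple-quoted strings span multiple lines and also interpolate
def note = """Name: ${p.name}
Age: ${p.age}"""
assert note.readLines() == ['Name: Ada', 'Age: 37']
```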

In short, when you are hacking out large swaths of boilerplate in Java, using tedious syntax to do stuff with Maps, Lists, Regular Expressions and a variety of other things, you constantly think to yourself, "man I wish I could be doing this in Groovy". Groovy makes programming a pleasure.

I'm still learning it and look forward to using it for years to come.

/* ---[ 5. Clojure ]--- */

And speaking of bringing the joy back to programming, Clojure is a combination of elegance, joy and ... wait a minute, how do I do this in Clojure? I ran across someone who described himself as a "perennial Clojure beginner". I can identify with that. Since I don't come from a Lisp or functional programming background, the last year learning Clojure has been like learning to ride a bicycle again. Except this bicycle is tricked out and has gears, knobs and restrictions that are different from the other bicycles.

I've started proselytizing co-workers about Clojure. I get the "why Clojure?" question a lot, so here is my version:

  • Combines the best of Lisp, such as macros, and Java/JVM, such as its world class garbage collector (which a language built on immutable data structures needs)
  • Brilliant design for immutable data structures that is now being adopted by other languages (Scala for one)
  • Functional programming model, but with practical bent (not Haskell, but more pure than Common Lisp)
  • STM: software transactional memory -- brilliant solution to shared mutable state
  • Designed for concurrency (in a couple of different ways)
  • A fast dynamic language: faster than ruby and python, comparable to Java in many areas and can drop into Java easily when performance is the most important thing
  • ClojureScript: bring the power of Clojure macro writing, namespaces and better syntax to doing your JavaScript work
  • Data centric (like lisps), but even better by being abstraction centric
  • Clean design for solving the “expression problem”
  • Separation of concerns – an overall philosophy to tease things apart into simple (non-complected) pieces:
    • Example: polymorphism is not tied to inheritance
  • Simple and elegant syntax. For example, I find Scala to be powerful but overwhelming and confusing in its approach to syntax and expression
  • Community:
    • Small focused libraries (separation of concerns, non-complected)
    • Datomic => one of the greatest examples of separation of concerns there is
    • Core.logic => modern logic programming easily integrated into your program
  • Finally, an argument ad hominem: Rich Hickey. You need to watch the series of presentations he’s made over the past 5 years (perhaps one every week as Bodil suggests). Unquestionably the most impactful thinker in CS I’ve ever encountered. Even if you end up not agreeing with all of his views, you will learn a lot and think about things in a different way, possibly changing the way you think about our craft.

Finally, as a coda to this paean to Clojure: The O'Reilly Clojure Programming book came out this year. Chas Emerick, Brian Carper and Christophe Grand have written a fantastic book. It is a book you will learn from and come back to for its insights, examples and reference material for many years. Definitely belongs on my top 10 for 2012 list.

/* ---[ 6. rupa/z ]--- */

The z shell script (not zsh) is one of my favorite discoveries of 2012. To give it more press, I gave it its own blog entry.

Here's the short summary: z is a 200-line shell script compatible with bash and z-shell that is a clever LRU-type cache of your directory visitations - the cache weighting is based on both frequency and recentness, which the author dubs "frecency". As you navigate around to different directories, it keeps track of where you've been, how often you've been there and how recently.

To navigate somewhere you've been, pass a part of the path to the z command and it will take you to the highest weighted directory in your cache.
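
To make the idea concrete, here is a toy sketch of frecency-style ranking in Groovy. This is not z's actual code or formula: the Visit class, the weights, the time buckets and the sample paths are all assumptions invented for illustration.

```groovy
// Hypothetical record of one directory in the cache (not z's data format)
class Visit {
    String path
    int count        // how often the directory was visited
    long lastVisit   // epoch seconds of the most recent visit
}

// Score = visit count scaled by how recently the directory was used.
// The buckets and weights are made up for this sketch.
def frecency = { Visit v, long now ->
    long age = now - v.lastVisit
    def weight = age < 3600 ? 4.0 : (age < 86400 ? 2.0 : (age < 604800 ? 0.5 : 0.25))
    v.count * weight
}

// Return the highest-scoring path containing the query fragment, or null
def jump = { List<Visit> visits, String query, long now ->
    def candidates = visits.findAll { it.path.contains(query) }
    def best = candidates.max { frecency(it, now) }
    best?.path
}

long currentTime = 1000000
def visits = [
    new Visit(path: '/home/me/projects/clojure-app', count: 10, lastVisit: currentTime - 700000),
    new Visit(path: '/home/me/projects/groovy-scripts', count: 4, lastVisit: currentTime - 60),
    new Visit(path: '/var/log', count: 50, lastVisit: currentTime - 120)
]

// The recently used directory beats the one visited more often but long ago
assert jump(visits, 'projects', currentTime) == '/home/me/projects/groovy-scripts'
```

The point of the weighting is visible in the last line: a directory visited four times a minute ago outranks one visited ten times over a week ago.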

/* ---[ 7. Xubuntu ]--- */

I'm a Linux guy. I was on the Ubuntu bandwagon for many years. I played with Linux Mint a little. I've got Fedora and CentOS running in VirtualBox VMs. But when Unity came out on Ubuntu, I struggled to get used to its desktop model. It does not fit how I work. I tried it for a month and was considering what to switch to when I saw a Slashdot article that Linus Torvalds was adopting XFCE to get away from the strangeness of many modern Linux desktop environments.

So that prompted me to try Xubuntu, based on XFCE and also Lubuntu, based on the LXDE desktop. Lubuntu was a little too minimal for me, but Xubuntu clicked for me right away. I don't like the Dash of Unity and I really really hate the fact that when I try to open a new console shell it brings the current one to the forefront. That is not what I want. I'll use Alt-Tab for that.

Xubuntu behaves as you expect. Click the terminal icon and it opens a new terminal. Xubuntu puts shortcut icons on the bottom, similar to Apple's desktop, but without the annoying enlargement animations. I don't do a lot of customization of my desktop. I just want one that has sane defaults and Xubuntu is that for me.

Ubuntu also stirred up criticism for its integration with Amazon affiliated advertisements, making the Dash a purchasing platform, in the process creating data leaks. Now you don't even have privacy when operating your desktop. The EFF write-up summarizes this nicely.

You can turn it off, but even among Linux users I suspect the "tyranny of the default" will mean that most users are leaking data and thus are at the mercy of Canonical, which people are starting to develop some mistrust for.

Well, Xubuntu doesn't have Dash. So you get the goodness of the Ubuntu ecosystem without the privacy violations. Its defaults are sane.

Try it out.

/* ---[ 8. Coursera and Online Education ]--- */

2012 is the year that online education skyrocketed. I've done a few CodeSchool courses and enjoyed those. But now there's Udacity and Coursera and Udemy and edX and probably 10 more I don't know about.

This year I took a Coursera course: Functional Programming Principles in Scala taught by Martin Odersky. It was a great experience. The format is excellent - each week there are about 2 to 3 hours of video lectures and a programming assignment that takes anywhere from 5 to 15 hours to complete. The examples were challenging enough to make the time investment worth it. And I got a nice certificate at the end for having a passing grade.

Uploading assignments was done via a command in Scala's sbt tool; it was easy and seamless. The assignments were graded automatically in about 10 minutes and gave good feedback, allowing you to fix problems and resubmit. The only part of the course I didn't enjoy was using the Scala Eclipse IDE, which is still quite painful compared to Java in Eclipse or Clojure in Emacs.

It's amazing what you can get online for free these days. I've signed up for two more courses and have my eye on a cryptography course there as well.

/* ---[ 9. Fiddler2 and Wireshark ]--- */

I spent a good deal of time this year maintaining and enhancing a large "legacy" web app that uses Ajax calls to communicate with the Java back-end. In many cases, the shortcut to figuring out what is going on is to watch the traffic between the browser and server. Fiddler2 is an invaluable tool for that.

I also tried Wireshark, but the output from Fiddler2 is just as intuitive and easy to follow as can be, since it focuses only on HTTP traffic.

Wireshark is more general purpose. I started learning it this year and want to get better at configuring and customizing it, so I can use it effectively (and efficiently) on Linux, since Fiddler2 is unfortunately a Windows-only product.

/* ---[ 10. TrueCrypt, GPG and other encryption tools ]--- */

If you aren't using encryption for your files, hard drives and passwords, make it your new year's resolution to learn the tools. Ever since Phil Zimmermann bravely pioneered encryption for the everyman, the suite of tools available to do this has gotten better and better.

I use GPG to encrypt individual files, TrueCrypt to encrypt thumb drives and external drives and Ubuntu's full disk encryption for my laptops. If you have a laptop and thumb drives, they should be encrypted.

A nice file encryption tool on Windows is AxCrypt.

For passwords, I use LastPass, which I believe does it all correctly and securely in a "trust no one" fashion.

Consider using an encrypted "Trust No One" backup and file syncing service. Dropbox is not encrypted, nor is SkyDrive or Google Drive or many other popular services. Do not upload anything to those systems that you wouldn't mind having broadcast on the internet or at least read by employees of those companies.

Steve Gibson (of the Security Now podcast) did a multi-episode analysis of backup and file syncing services from an encryption and "trust no one" perspective. Start with episode #349. There are a number of good solutions. I use SpiderOak on Linux.

If you already know and do this stuff, have a CryptoParty in your area. If you live in my area (Raleigh/Durham, North Carolina, USA), join the DC919 group.

/* ---[ Datomic: Mine goes to 11 ]--- */

While I did attend a Datomic training course this year and wrote a fairly long blog post about it, I just haven't made the time to really study it yet. I fully intend to, as I think it is one of the most profound and important things to have come out in 2012. I've queued it up to be on my top 10 list in 2013.
