Posts Tagged: code


25
Nov 09

refactor

google thinks this means refactor

I’ve been working a lot on prosper lately (which means that if you want to play around with it, get the bleeding edge on GitHub). I recently added a lazy loading feature, mostly to facilitate unit testing, and pretty soon I will be adding some unit tests. I was able to take care of some things recently that have bugged me for a while, and it’s because of ruthless refactoring.

Refactoring is not black magic. It just means taking some code and reworking it: rename a function here, encapsulate some functionality there, done. The sticky part is that refactoring is a compounding problem. Things that should be refactored build up on each other, so if you put off refactoring for too long you will have a real mess on your hands. Refactoring can be painful but should never be put off. If you see a poorly named function and know that it is used in 20 places, fix it now, because by the time you get around to renaming it, it will be in more places.
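Here is a minimal sketch of the "encapsulate some functionality" kind of refactoring. The function names (invoice_line_before, format_money, invoice_line) are made up for illustration and are not from prosper. The point is only that the behavior stays the same while the formatting gets a name of its own.

```php
<?php
// Before: one function does the arithmetic and the formatting.
function invoice_line_before($price, $qty) {
    $total = $price * $qty;
    return '$' . number_format($total, 2);
}

// After: the formatting is extracted into its own well-named helper.
function format_money($amount) {
    return '$' . number_format($amount, 2);
}

function invoice_line($price, $qty) {
    return format_money($price * $qty);
}

// Both versions produce the same string for the same input.
echo invoice_line_before(2.5, 4), "\n";
echo invoice_line(2.5, 4), "\n";
```

Once format_money exists, every other place that builds a money string by hand becomes an obvious candidate for the same cleanup.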

A good IDE will help you in your refactoring process; the one that I love is (surprise, surprise) Eclipse. Eclipse is brilliant at refactoring, probably because it keeps semantic knowledge with symbol names. Rename the function foo to bar and you don’t have to worry about Eclipse destroying all those variables named foo (P.S. if you have variables named foo, you have refactoring to do!). Eclipse and other IDEs are great at all kinds of refactoring: renaming, extracting and encapsulating, pull up, push down, etc. Get to know these tools, get comfortable with using them, and refactor without fear.

Wait, change big chunks of code without fear, how!? Well, you should have tests, be they unit or functional. I’m introducing unit tests into prosper, but for the time being I have a local version of the todo list case study. The todo list application is a pretty good check because it exercises all the important bits: if I change something and everything in there keeps working, I haven’t broken anything too badly.

The reason I want to introduce unit tests, though, is that I’ve introduced regressions even with the todo list humming along. Some dark corner of prosper starts shooting out unicorns, and I don’t notice until I see some bit of code that I know can’t be doing the right thing. Test coverage allows you to refactor without fear, and if you want to have a living project that people might someday use, you need to refactor without fear to keep things current.
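As a rough sketch of what checks for a dark corner can look like, here is a tiny hand-rolled test in plain PHP. Both quote_identifier and check_same are hypothetical names invented for this example; they are not part of prosper or of any test framework. A real project would more likely use a proper unit testing library.

```php
<?php
// Hypothetical function under test; it stands in for any library code.
function quote_identifier($name) {
    return '`' . str_replace('`', '``', $name) . '`';
}

// A minimal check helper: returns true on success, throws on mismatch.
function check_same($expected, $actual, $label) {
    if ($expected !== $actual) {
        throw new Exception("FAILED: $label");
    }
    return true;
}

// The second check covers the kind of corner a demo app rarely exercises.
check_same('`users`', quote_identifier('users'), 'plain name');
check_same('`odd``name`', quote_identifier('odd`name'), 'embedded backtick');
echo "all checks passed\n";
```

If a later refactoring breaks the backtick handling, the second check fails right away instead of waiting for someone to stumble across it.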


That’s it for the technical part of this blog post. I would like to announce a few things, though.

  • The move to GitHub has been completed and I’m starting to get into the workflow of using it now.
  • The new skunkworks project is nearing a beta release, so keep an eye out.
  • Development on the new project has spurred improvements in prosper, and there will be a point release (0.6) soon.
  • There probably won’t be a post tomorrow as I’m traveling with Heather, my beautiful girlfriend, up to see my family in Cleveland.

That’s all for now, I will have something up Black Friday more than likely, until then, Happy Turkey Day everyone, safe travels, see you after I put another couple of pounds on.


10
Nov 09

convention vs convenience

convenience, american style

I’m a huge believer in Convention over Configuration because it makes life easier and makes me more productive. I used to program in Java, and unless there is some seismic shake-up, I will soon be going back to Java. I like Java well enough; it’s no Lisp or Ruby, but it has its place in the business world. I have many gripes with Java (verbosity, complexity, etc.), but when you are in the enterprise trying to work with a bunch of third-party pieces cobbled together into a hulking software nightmare, the worst, by far, is configuration.

It makes some business sense. If a customer won’t use your product because they want to change the tooltip on the help page and they can’t without hiring a Java programmer (or at all, because you distribute closed source .jar files), the simplest solution is to toss a config file at them. Just change this or that setting in the config file and, look at that, the whole application is Christmas colored and in Sanskrit. There are some huge problems with Java configuration, though.

  1. The unfortunate pairing with XML – XML hit its high water mark around the same time as Java, maybe there was some reciprocal love there, and it can be maddening.
    <env-entry>
      <env-entry-name>maxExemptions</param-name>
      <env-entry-value>10</env-entry-value>
      <env-entry-type>java.lang.Integer</env-entry-type>
    </env-entry>
    

    Oh fuck me, are you serious?! I took that from the official Tomcat Documentation. It just sets maxExemptions = 10, but it takes 5 lines, 2 layers of nesting, 4 open tags, and 4 close tags. And as you can see, this is a straight copy and paste from the official documentation, and it has a pretty clear error. Clear to me because my eyes have been trained to read XML like a champion: the env-entry-name tag is closed by a param-name tag, and that isn’t right.

  2. Undocumented DSL – Every configuration is basically an undocumented Domain Specific Language wrapped up in XML’s ugly ass clothing. There is little transfer of knowledge between Java and a Java Configuration file, or even between XML and a Java Configuration file.
  3. Undiscoverable – Maybe it would be more accurate to say hard to discover. Where do you go to figure out what belongs in your config, or what nodes configure what? Your best bet is to hope the developer wrote up (and subsequently kept up to date) some documentation. You get little to no help from your favorite IDE’s code completion, and the constrained nature of the problem domain makes web searches less likely to yield helpful information.
  4. Twiddling – Like in Field of Dreams, if you make a configuration, they will come. Sure, there is no good reason for X to be configurable, but I don’t like having hardcoded values in my code, so every hardcoded value is now configurable, and I’m on the slippery slope of softcoding. This gives the end user too much power to twiddle around and configure things that don’t ever need to be played with. Just because we can doesn’t mean we should.

This isn’t just Java’s problem; Java is just the easiest to pick on because it seems like it’s everywhere. Then Ruby on Rails came onto the scene and popularized this idea of convention over configuration. I want to put my models in the fliggity directory? That’s too bad, they go in the models directory. I would like to name my table that stores user data ‘tc_people_datastore’? Yeah, well, I would like a billion dollars; you are going to call it users. This means that if tomorrow I’m told to go work on a RoR project I have never seen, I would have a good idea how the project is laid out, where the data lives, and how everything is hooked together. This convention eases the mental load I have to carry around, replacing it with simple, sane rules.
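To make the idea concrete, here is a minimal sketch in PHP of a convention with a small escape hatch. The function name table_for, the overrides array, and the naive "lowercase and add an s" rule are all assumptions for illustration. Rails’ real rules are more sophisticated; this only shows the shape of the idea.

```php
<?php
// Hypothetical convention: a model class maps to its lowercase name
// plus "s", unless an explicit override is supplied.
function table_for($class, array $overrides = array()) {
    if (isset($overrides[$class])) {
        return $overrides[$class];
    }
    return strtolower($class) . 's';
}

// The common case needs no configuration at all.
echo table_for('User'), "\n";
// The rare exception is stated once, right where it is needed.
echo table_for('Person', array('Person' => 'people')), "\n";
```

The default covers nearly everything, so a newcomer can guess the table name from the class name without opening a config file.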

Convention over configuration, especially the Rails way, has been called opinionated software. The software, in this case Rails, has an opinion about how things should be laid out, what your tables should be called, etc. I’m in the midst of writing my own software and API and I’ve decided that my software should have an opinion about stuff, but more importantly that things should follow certain conventions.

conventional breakfast

A corollary to convention is striving for consistency. Maybe a better title for this post would have been consistency vs convenience, but I’m already 700 words in, no going back now. I’d like a consistent API for a few reasons. Consistency makes it easy to remember: does this function go ($needle, $haystack) or ($haystack, $needle)? They all go ($needle, $haystack), calm down. Consistency makes wrong code look wrong: after seeing the same type of thing over and over again, the pattern gets burned into your brain, and any deviation is obvious. I’m a little OCD, and making things consistent feels better.
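PHP’s own standard library is a handy illustration of the needle/haystack question: in_array takes the needle first, while strpos takes the haystack first. The sketch below wraps both so that everything goes ($needle, $haystack). The wrapper names contains_item and contains_text are made up for this example.

```php
<?php
// Built-in PHP functions disagree on argument order:
$in_list = in_array('b', array('a', 'b'));  // needle first
$offset  = strpos('abc', 'b');              // haystack first

// Hypothetical wrappers that always take ($needle, $haystack).
function contains_item($needle, array $haystack) {
    // Strict comparison, so '1' does not match the integer 1.
    return in_array($needle, $haystack, true);
}

function contains_text($needle, $haystack) {
    // strpos returns false when nothing is found, and 0 is a valid offset.
    return strpos($haystack, $needle) !== false;
}

var_dump(contains_item('b', array('a', 'b')));
var_dump(contains_text('b', 'abc'));
```

With one argument order everywhere, a call like contains_text('abc', 'b') starts to look wrong at a glance, which is the whole point.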

The problem is that convention taken too far leads to an ugly little town called boilerplate, and no one wants to live there. This is where convenience comes in: it lets you more or less follow convention while giving yourself an out to skip over the obvious parts. The trick is striking an appropriate balance. I have a story of failure and redemption that I will quickly share with you.

My new side project is written in PHP and makes use of the ubiquitous associative array when it makes sense to. I love me some PHP, but I hate the associative array literal syntax, and it’s not going to change anytime soon. For those of you unfamiliar, here is how you would make the same associative array in JSON-style JavaScript and in PHP.

var example = {'a': 1, 'b': 2, 'c': 3};
$example = array('a' => 1, 'b' => 2, 'c' => 3);

Doesn’t look too bad or different, but if you have to have nested arrays or use an array in a function signature (my case), it gets ugly pretty quickly. I have also been reading up on Lisp a lot recently, and it has the idea that successive arguments can act as a pair. I thought this was a pretty nifty idea, so I set about creating an alternative calling convention.

$foo->bar(array('name' => 'Matt', 'age' => 23));

Would be identical to

$foo->bar('name', 'Matt', 'age', 23);

I thought that this looked much nicer, and it is fairly trivial to implement:

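class Foo {
  public function bar($values) {
    // With more than one argument, treat the arguments as key/value pairs.
    if(func_num_args() > 1) {
      $values = self::associate(func_get_args());
    }
    // ...
  }

  // Turns array(k1, v1, k2, v2, ...) into array(k1 => v1, k2 => v2, ...).
  public static function associate($args) {
    $count = count($args);
    // Pad an odd-length list so the last key maps to null.
    if($count % 2 == 1) {
      $args[] = null;
      ++$count;
    }
    $result = array();
    for($i = 0; $i < $count; $i += 2) {
      $result[$args[$i]] = $args[$i + 1];
    }
    return $result;
  }
}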

I had done it, bam, Lisp style associative arguments in PHP. The problem, though, is that, well, wtf? That is going to be the reaction of any PHP programmer unfamiliar with the Lisp convention. I failed to follow the conventions of the language, so this morning I tore this code out. It added a secondary way to call a function, it introduced several edge cases, and the benefit was dubious. I was scratching my language implementer itch, but not in an appropriate fashion. This time convenience had to be sacrificed for convention’s sake.
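For the curious, here is a small sketch of two of those edge cases. It uses a standalone copy of the pairing helper as a plain function (an assumption made so the example runs on its own, not the code as it lived in the project).

```php
<?php
// Standalone copy of the pairing helper, for illustration only.
function associate(array $args) {
    if (count($args) % 2 == 1) {
        $args[] = null;  // odd count: pad so the last key maps to null
    }
    $result = array();
    for ($i = 0; $i < count($args); $i += 2) {
        $result[$args[$i]] = $args[$i + 1];
    }
    return $result;
}

// Edge case 1: a dangling key silently becomes null.
var_dump(associate(array('name', 'Matt', 'age')));

// Edge case 2: a repeated key silently overwrites the earlier value.
var_dump(associate(array('a', 1, 'a', 2)));
```

Neither call raises an error, so a forgotten argument or a duplicated key turns into quietly wrong data. With the plain array literal, at least the missing value is visible on the page.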