Posts Tagged: api


10
Nov 09

convention vs convenience

convenience, american style

I’m a huge believer in Convention over Configuration because it makes life easier and makes me more productive. I used to program in Java, and unless there is some seismic shake-up, I will soon be going back to Java. I like Java well enough; it’s no lisp or ruby, but it has its place in the business world. I have many gripes with Java (verbosity, complexity, etc.), but when you are in the enterprise trying to work with a bunch of third party pieces cobbled together into a hulking software nightmare, the worst, by far, is configuration.

It makes some business sense. If a customer won’t use your product because they want to change the tooltip on the help page and they can’t without hiring a Java programmer, or can’t at all because you distribute closed source .jar files, the simplest solution is to toss a config file at them. Just change this or that setting in the config file and, look at that, the whole application is christmas colored and in sanskrit. There are some huge problems with Java configuration though.

  1. The unfortunate pairing with XML – XML hit its high water mark around the same time as Java, maybe there was some reciprocal love there, and it can be maddening.
    <env-entry>
      <env-entry-name>maxExemptions</param-name>
      <env-entry-value>10</env-entry-value>
      <env-entry-type>java.lang.Integer</env-entry-type>
    </env-entry>
    

    Oh fuck me, are you serious?! I took that from the official Tomcat Documentation. It just sets maxExemptions = 10, but it takes 5 lines, 2 layers of nesting, 4 open tags, and 4 close tags. And as you can see, this straight copy and paste from the official documentation has a pretty clear error. Clear to me, anyway, because my eyes have been trained to read XML like a champion: the env-entry-name tag is closed by a param-name tag, and that isn’t right.

  2. Undocumented DSL – Every configuration is basically an undocumented Domain Specific Language wrapped up in XML’s ugly ass clothing. There is little transfer of knowledge between Java and a Java Configuration file, or even between XML and a Java Configuration file.
  3. Undiscoverable – Maybe it would be more accurate to say hard to discover. Where do you go to figure out what belongs in your config, or which nodes configure what? Your best bet is to hope the developer wrote up (and subsequently kept up to date) some documentation. There is little to no help from your favorite IDE’s code completion, and the constrained nature of the problem domain makes web searches less likely to yield helpful information.
  4. Twiddling – Like in Field of Dreams, if you make a configuration, they will come. Sure, there is no good reason for X to be configurable, but I don’t like having hardcoded values in my code, so every hardcoded value is now configurable, and I’m on the slippery slope of softcoding. This gives the end user too much power to twiddle around and configure things that never need to be played with. Just because we can doesn’t mean we should.

This isn’t just Java’s problem; Java is just the easiest to pick on because it seems like it’s everywhere. Then Ruby on Rails came onto the scene and popularized this idea of convention over configuration. I want to put my models in the fliggity directory? That’s too bad, they go in the models directory. I would like to name my table that stores user data ‘tc_people_datastore’? Yeah, well, I would like a billion dollars; you are going to call it users. This means that if tomorrow I’m told to go work on a RoR project I have never seen, I would have a good idea how the project is laid out, where the data lives, and how everything is hooked together. This convention eases the mental load I have to carry around, replacing it with simple, sane rules.
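To make the idea concrete, here is a minimal sketch of what a table naming convention could look like in php. It is not how Rails does it; the Model base class, the tableName() method and the naive "add an s" pluralizing are all made up for illustration.

```php
<?php
// Hypothetical base class: the table name comes from the class name by
// convention, and a subclass only speaks up when it is a real exception.
abstract class Model {
    // Optional override; null means "follow the convention".
    protected static $table = null;

    public static function tableName() {
        if (static::$table !== null) {
            return static::$table;
        }
        // Convention: lowercase class name plus a naive "s" plural.
        return strtolower(get_called_class()) . 's';
    }
}

// Nothing to configure: the convention does the work.
class User extends Model {}

// An irregular plural is the one place an override earns its keep.
class Person extends Model {
    protected static $table = 'people';
}

echo User::tableName(), "\n";   // users
echo Person::tableName(), "\n"; // people
```

The point of the sketch is that the common case needs zero lines of configuration, so a newcomer can guess the table name from the class name alone.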

Convention over configuration, especially the rails way, has been called opinionated software. The software, in this case rails, has an opinion about how things should be laid out, what your tables should be called, etc. I’m in the midst of writing my own software and API and I’ve decided that my software should have an opinion about stuff, but more importantly that things should follow certain conventions.

conventional breakfast

A corollary to convention is striving for consistency. Maybe a better title for this post would have been consistency vs convenience, but I’m already 700 words in, no going back now. I’d like a consistent API for a few reasons. Consistency makes it easy to remember: does this function go ($needle, $haystack) or ($haystack, $needle)? They all go ($needle, $haystack), calm down. Consistency makes wrong code look wrong: after seeing the same type of thing over and over again, the pattern gets burned into your brain and any deviation is obvious. I’m a little OCD, and making things consistent feels better.
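php itself is a handy example of what the inconsistency feels like. Its built-in array search functions take the needle first, while its string search functions take the haystack first. A small sketch (the sample values are only for illustration):

```php
<?php
$haystack = array('apple', 'banana', 'cherry');

// Array functions take the needle first...
var_dump(in_array('banana', $haystack));     // bool(true)
var_dump(array_search('banana', $haystack)); // int(1)

// ...while string functions take the haystack first.
var_dump(strpos('banana split', 'split'));   // int(7)
var_dump(strstr('banana split', 'split'));   // string(5) "split"
```

Neither order is wrong on its own; the cost comes from having to look up which one applies every single time.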

The problem is that convention taken too far leads to an ugly little town called boilerplate, and no one wants to live there. This is where convenience comes in: it lets you more-or-less follow convention while giving yourself an out to skip over the obvious parts. The trick is striking an appropriate balance. I have a story of failure and redemption that I will quickly share with you.

My new side project is written in php and makes use of the ubiquitous associative array when it makes sense to. I love me some php, but I hate the associative array literal syntax, and it’s not going to change anytime soon. For those of you unfamiliar, here is how you would make the same associative array as a JavaScript object literal (json, more or less) and in php.

var example = {'a': 1, 'b': 2, 'c': 3};
$example = array('a' => 1, 'b' => 2, 'c' => 3);

Doesn’t look too bad or different, but if you have to nest arrays or use an array in a function signature (my case) it gets ugly pretty quickly. I have also been reading up on lisp a lot recently, and it has the idea that successive arguments can act as key/value pairs. I thought this was a pretty nifty idea, so I set about creating an alternative calling convention.

$foo->bar(array('name' => 'Matt', 'age' => 23));

Would be identical to

$foo->bar('name', 'Matt', 'age', 23);

I thought that this looked much nicer, and it is fairly trivial to implement:

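class Foo {
  function bar($values) {
    // More than one argument means the caller used the flat key/value style.
    if(func_num_args() > 1) {
      $values = self::associate(func_get_args());
    }
    // ...
  }

  // Turn a flat list (key, value, key, value, ...) into an associative array.
  static function associate($args) {
    $result = array();
    $count = count($args);
    // Pad an odd-length list so the last key gets a null value.
    if($count % 2 == 1) {
      $args[] = null;
      ++$count;
    }
    for($i = 0; $i < $count; $i += 2) {
      $result[$args[$i]] = $args[$i + 1];
    }
    return $result;
  }
}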

I had done it, bam, lisp style associative arguments in php. The problem though is that, well, wtf? That is going to be the reaction of any php programmer unfamiliar with the lisp convention. I failed to follow the conventions of the language, so this morning I tore this code out. It added a second way to call a function, it introduced several edge cases, and the benefit was dubious. I was scratching my language implementer itch, but not in an appropriate fashion. This time convenience had to be sacrificed for convention’s sake.
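For a feel of the kind of edge cases a flat key/value calling convention invites, here is a small sketch. It uses standalone stand-ins (a free function associate() and a bar() that just returns what it would work with) rather than the code that was torn out, and the cases shown are guesses at likely trouble spots, not a list from the project.

```php
<?php
// Standalone copy of the pairing helper, for illustration only.
function associate(array $args) {
    $result = array();
    $count = count($args);
    for ($i = 0; $i < $count; $i += 2) {
        // A trailing key with no partner gets null.
        $result[$args[$i]] = ($i + 1 < $count) ? $args[$i + 1] : null;
    }
    return $result;
}

// Hypothetical stand-in for Foo::bar(): returns the values it would use.
function bar($values) {
    if (func_num_args() > 1) {
        $values = associate(func_get_args());
    }
    return $values;
}

// A forgotten value is silently turned into null.
var_dump(bar('name', 'Matt', 'age'));

// A single key never triggers the pairing, so a bare string comes through.
var_dump(bar('name'));

// The original array style still works, which means two ways to call bar().
var_dump(bar(array('name' => 'Matt', 'age' => 23)));
```

None of these fail loudly, which is exactly what makes them unpleasant to debug.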


9
Nov 09

api design

blinkenlight interface

I’m still working on my skunkworks side project, and over the weekend I had the joy of integrating several third party php libraries. I got to spend a good amount of time on php.net reading over APIs and figuring out how to fit them into my project. Some of them were sublime, as though the author had read my mind and knew my exact mental model. Some of them were abominations, fighting me all the way. This got me thinking about the design of a good API.

What makes an API good? There are a few things that make an API really nice to work with.

  1. Similarity of behavior – Writing an API that does searching through a b-tree? Look at how searching is implemented for arrays or strings, and then copy the crap out of that API. This allows the developer to use all that knowledge they’ve built up about searching, so if I know
    array_first($needle, $haystack)

    returns the first instance of $needle in $haystack or FALSE on failure to find $needle, then

    btree_first($needle, $haystack)

    should work the same way.

  2. Readability – Your API should produce code that is readable: the function names should be descriptive (without being overly verbose), and code written with it should flow nicely. Avoid difficult-to-pronounce function names like strcspn.
  3. Minimalism – You’re writing an API because you are doing something non-trivial, something complex enough that you want a simple looking API to interact with, so do exactly that, make it simple. I’m sure that reflangulating the zyffer is a complex process that involves juxtaposing the allibaster and repeppering the kilgore while making sure not to narfle the garthok, but the reason you are writing an API is to hide that complexity away, don’t write a Leaky Abstraction. Allow me to write code like this
    $zyffer = new Zyffer();
    $zyffer->reflangulate();
    

    Not like this

    $zyffer = new Zyffer();
    $zyffer->juxtapose(Zyffer::allibaster);
    $zyffer->denarfle(Zyffer::garthok);
    $zyffer->repepper(Zyffer::kilgore);
    $zyffer->reflangulate();
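
One way to get the first shape is to keep the fiddly steps private and let a single public method run them in order. A minimal sketch, where every name is as made up as the Zyffer itself and each step only records that it ran:

```php
<?php
// Hypothetical Zyffer: the steps are placeholders that log what they did.
class Zyffer {
    private $log = array();

    // The one public entry point; it runs the fiddly steps in the right order.
    public function reflangulate() {
        $this->juxtapose('allibaster');
        $this->denarfle('garthok');
        $this->repepper('kilgore');
        $this->log[] = 'reflangulated';
        return $this->log;
    }

    // Private, so callers cannot forget a step or run them out of order.
    private function juxtapose($part) { $this->log[] = "juxtaposed $part"; }
    private function denarfle($part)  { $this->log[] = "denarfled $part"; }
    private function repepper($part)  { $this->log[] = "repeppered $part"; }
}

$zyffer = new Zyffer();
print_r($zyffer->reflangulate());
```

The caller sees two lines, and the ordering knowledge lives in exactly one place.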
    

In the course of writing the API for my side project I’ve found it useful to put myself in the shoes of a new programmer trying to use my API. How long would it take them to figure out X? How often would they curse my name? What is the WTFs/min ratio looking like? It has been a helpful tool to adopt that mindset and ask myself: why am I requiring this parameter? Why do they have to call this function before that function, and would that be apparent? Why am I making their life so difficult? It has helped me slim down my API considerably, and combined with Convention over Configuration it has led to an API approaching “not terrible.”

Then it hit me, I have stumbled upon a big important rule, that I’ve implicitly been following for years, but now my brain is aware of it.

You should write everything like it will one day be a public API

Of course, like any rule that is written in absolutes, there are sure to be exceptions. But I think on the whole it will serve you well, for a few reasons. Public APIs are written to be simple to work with, which means that after you’ve encapsulated all the hard complicated stuff you can interact with a nice clean API. This will make your life nice when working with your API, but the best part is maintenance. Remember that new programmer we imagined to help write the API? That will be you, intrepid API writer, just a few short months after moving on from the API. You will inevitably be called back to add a feature or fix a bug, and if your API is simple to pick up, then you will remember more of it, and relearn the parts you forgot that much quicker.

Another advantage is that a well-written API is much more likely to encapsulate well-written code. When your API is separated nicely and parceled up cleanly into well defined units of work, the code powering them will probably have a well defined division of labor and understandable flow. The API itself becomes the documentation for how a process is accomplished, with the various actors nicely laid out as well defined classes and the behaviors as fancy-pants interfaces. I’m certain you can create a clean API over a horrible pile of spaghetti code, just as surely as you can create a crap API over a beautiful collection of clean OOP, but a good API encourages good code.

At the end of the day the API is the face of your code. You can write up all the pretty documentation and how-to’s and promotional websites with sweet web 2.0 reflection and fun oversized graphics, but when it gets down to brass tacks the developer is going to be instantiating your objects and calling your functions. The fact that there is a pretty floating cloud icon isn’t going to make a developer feel any better that she just spent 4 hours figuring out that she forgot to juxtapose the allibaster and that made shit hit the fan when she called the promulgate function on the zyffer’s subclown. Make sure that your code has a pretty face, even if it’s only you using it; the benefits will far outweigh the minor upfront costs.