rant


3
Nov 09

i can’t stop you from being stupid

I’ve been working on a skunkworks side project that is nearing release. I’m down to the last 5%, which is the most difficult part. Suffice it to say that this project is a library meant to be used by other developers (and myself) to develop neat and nifty things. One of the most complicated parts is defining an API that makes doing the right thing easy and doing the wrong thing really hard or impossible. The problem is that as I pressed on, I realized that almost half my code was checking for some stupid thing and protecting against it, building up state machines and syntax trees in the hope of reporting to you that you had done something stupid. Then it dawned on me: I can’t stop you from being stupid.

The other thing that dawned on me is that I shouldn’t try to. There are already built-in errors that will point out what you did wrong; I don’t need to add another layer on top. So I came up with some simple guidelines, applied them to my project, and watched unnecessary code melt away.

  • Pick sane defaults
  • Assume the developer knows what they are doing
  • Make it easy to do the right things
  • Make it hard to do the wrong things

The beautiful part of all of it is that my code became easier to read and understand, which means that when there is a problem and you need to drop into my code (it is open source), you will be able to understand what’s going on. Let’s look at what I mean with a contrived example: an HTML-building library, pared down to 4 functions for clarity.

class HTML {
  static function open_html() {
    echo "<html>";
  }

  static function open_body() {
    echo "<body>";
  }

  static function close_body() {
    echo "</body>";
  }

  static function close_html() {
    echo "</html>";
  }
}

The way I was programming did a bunch of hand-holding and sanity checks, which were nice but unnecessary. The class became so bloated with state and sanity checking that it was becoming unwieldy. Let’s take a look at an overly protective incarnation of the above code.

class HTML {
  private static $html_open = false;
  private static $body_open = false;

  static function open_html() {
    if(self::$html_open) {
      self::handle_error("html is already open");
    }
    echo "<html>";
    self::$html_open = true;
  }

  static function open_body() {
    if(self::$body_open) {
      self::handle_error("body is already open");
    } else if (!self::$html_open) {
      self::handle_error("body must be contained in html");
    }
    echo "<body>";
    self::$body_open = true;
  }

  static function close_body() {
    if(!self::$body_open) {
      self::handle_error("cannot close unopened body");
    } else if(!self::$html_open) {
      self::handle_error("body must be contained in html");
    }
    echo "</body>";
    self::$body_open = false;
  }

  static function close_html() {
    if(self::$body_open) {
      self::handle_error("cannot close html, unclosed body exists");
    } else if(!self::$html_open) {
      self::handle_error("cannot close unopened html");
    }
    echo "</html>";
    self::$html_open = false;
  }

  static function handle_error($message) {
    //What is the correct behavior, should we attempt to fix the error, report it, who knows.
    //We'll just stop execution
    die($message);
  }
}

Those sure are some nice error messages, and they keep you from creating malformed HTML, but what’s the point of it all? The browser is more than happy to tell you that your markup is invalid, or you can run the output through a lint checker or the W3C validator. Why should this class have some buggy, half-implemented validator inside of it? The answer is, it shouldn’t.

If you write stupid code you should get stupid results: Garbage In – Garbage Out. Library code shouldn’t hold your hand, making absolutely sure that you never make a mistake. What’s the point of it? If you write the following code

HTML::open_body();
HTML::close_html();
HTML::open_html();

You are clearly missing something about how markup works. It’s not the library’s job to hold your hand and guide you through this crisis; the browser will slap you in the face and you will have to learn something. Now there is a caveat: here we are generating HTML, and there are great utilities for finding errors in HTML, so we don’t need to reinvent the wheel. If you are writing a library or application code where the error already displayed when something goes wrong is fine, don’t reinvent the wheel. If that error is unacceptable, you may need to do some error reporting of your own.

At the end of the day you can’t keep the users of your creation from being stupid. They are going to do stupid things, stub their toes, and curse your name. What you need to make sure of is that they had to go far off the beaten trail to do so, overlooking obviously better ways to do it, so that it’s their fault and not yours. Make it easy to do the right thing and hard to do the wrong thing, but if the user wants to open the body tag before the html tag, let them. They have to learn sooner or later.
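Here is one way “easy to do the right thing” could look without any bookkeeping. This is only a minimal sketch, written in Python rather than PHP, and it is not the library discussed above: the HTML class and its tag(), text() and render() methods are made-up names for illustration. The idea is that a with block always closes what it opened, so correct nesting falls out of the shape of the calling code and there is no state to check.

```python
import io
from contextlib import contextmanager


class HTML:
    """Hypothetical builder: nesting comes from `with` blocks, not state checks."""

    def __init__(self):
        self.out = io.StringIO()

    @contextmanager
    def tag(self, name):
        # Write the opening tag, hand control back to the caller,
        # then always write the matching closing tag.
        self.out.write(f"<{name}>")
        try:
            yield self
        finally:
            self.out.write(f"</{name}>")

    def text(self, content):
        # No escaping or validation: garbage in, garbage out.
        self.out.write(content)

    def render(self):
        return self.out.getvalue()


page = HTML()
with page.tag("html"):
    with page.tag("body"):
        page.text("hello")

print(page.render())  # <html><body>hello</body></html>
```

Nothing here stops you from putting html inside body. You just have to go out of your way to do it, and the output will be exactly as wrong as what you asked for.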


7
Oct 09

just because it’s better doesn’t mean it’s good

I recently wrote a blog post called “why no love for scripting languages,” lamenting the lack of open source scripting environments in Windows 7. I got some interesting feedback, most of which went along the lines of “POWERSHELL!!!1!!one!”

Being an interested fellow, I took it upon myself to look up PowerShell, and it looks like a nice language, really good for administrative tasks. I read some of the manual and it looked OK, so I started looking for scripts that would compare old-school .bat to the new PowerShell. I found what I was looking for at PowerShell Examples. We will be looking at an example that displays the current date.

@ECHO OFF
IF NOT "%1"=="" GOTO Syntax

:: Use BATCHMAN to retrieve day
BATCHMAN DAY
:: Errorlevel 0 means BATCHMAN was not found
IF NOT ERRORLEVEL 1 GOTO NotFound
FOR %%A IN   (1 2 3 4 5 6 7 8 9) DO IF ERRORLEVEL  %%A SET DD=0%%A
FOR %%A IN (0 1 2 3 4 5 6 7 8 9) DO IF ERRORLEVEL 1%%A SET DD=1%%A
FOR %%A IN (0 1 2 3 4 5 6 7 8 9) DO IF ERRORLEVEL 2%%A SET DD=2%%A
FOR %%A IN (0 1)                 DO IF ERRORLEVEL 3%%A SET DD=3%%A

:: Use BATCHMAN to retrieve month
BATCHMAN MONTH
FOR %%A IN (1 2 3 4 5 6 7 8 9) DO IF ERRORLEVEL  %%A SET MM=0%%A
FOR %%A IN (0 1 2)             DO IF ERRORLEVEL 1%%A SET MM=1%%A

:: Use BATCHMAN to retrieve year
BATCHMAN YEAR
FOR %%A IN (0 1 2 3 4 5 6 7 8 9) DO IF ERRORLEVEL  %%A SET YYYY=198%%A
FOR %%A IN (0 1 2 3 4 5 6 7 8 9) DO IF ERRORLEVEL 1%%A SET YYYY=199%%A
FOR %%A IN (0 1 2 3 4 5 6 7 8 9) DO IF ERRORLEVEL 2%%A SET YYYY=200%%A
FOR %%A IN (0 1 2 3 4 5 6 7 8 9) DO IF ERRORLEVEL 3%%A SET YYYY=201%%A

:: Store in variable and clean up temporary variables
SET SortDate=%YYYY%%MM%%DD%
SET YYYY=
SET MM=
SET DD=

:: Display the result
ECHO.
ECHO SortDate = %SortDate%
GOTO End

:Syntax
ECHO.
ECHO SortDate.bat,  Version 1.00 for MS-DOS
ECHO Display the current date in YYYYMMDD format
ECHO.
ECHO Usage:  SORTDATE
ECHO.
ECHO This batch file uses BATCHMAN, a utility by Michael Mefford
ECHO.
ECHO Written by Rob van der Woude
ECHO http://www.robvanderwoude.com

:End

Rough. If you’ve ever had to write anything non-trivial in batch, you will start to feel that pain at the back of your eyes right now; this is your brain trying to eat your memories. Batch is hideous, and difficult to write, and gross, and everyone hates it. Don’t believe me? Well, you don’t have to: Microsoft agreed with me and began development on Monad in 2003, the project that would become PowerShell. Now let’s look at the PowerShell equivalent of this code

""
"Date / Format   YYYYMMDD        DD-MM-YYYY        MM/DD/YYYY"
"============================================================"
"Yesterday       " + (get-date (get-date).AddDays(-1) -uformat %Y%m%d) + "        " + (get-date (get-date).AddDays(-1) -uformat %d-%m-%Y) + "        " + (get-date (get-date).AddDays(-1) -uformat %m/%d/%Y)
"Today           " + (get-date -uformat %Y%m%d)                        + "        " + (get-date (get-date)             -uformat %d-%m-%Y) + "        " + (get-date (get-date)             -uformat %m/%d/%Y)
"Tomorrow        " + (get-date (get-date).AddDays(1)  -uformat %Y%m%d) + "        " + (get-date (get-date).AddDays(1)  -uformat %d-%m-%Y) + "        " + (get-date (get-date).AddDays(1)  -uformat %m/%d/%Y)

That is much better. I see some stuff that looks like objects in there (I’m definitely seeing the dot operator). I’m a n00b to PowerShell and yet this code is easy to grok and, all in all, very pretty. Nicely done Microsoft, you get a cookie.

Well this would be a pretty boring post if I just went around patting Microsoft on the head for doing a good job. The interesting thing about PowerShell Examples is that they go on to provide the same example in other languages as well. Here is the same thing in perl.

#! perl

# SortDate.pl,  Version 1.00
# Display "sorted" date (YYYYMMDD)
# Written by Rob van der Woude
# http://www.robvanderwoude.com

# Parse time string
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

# Add "base year"
$year = $year + 1900;

# Add 1, since month seems to be zero based
$mon  = $mon  + 1;

# Add leading zeroes if necessary
if ($mon < 10) {
 $mon = "0".$mon
}
if ($mday < 10) {
 $mday = "0".$mday
}

# Concatenate substrings
$sortdate = $year.$mon.$mday;

# Display result
print "\nSortDate = $sortdate\n";

It certainly could use a text formatter of some kind to get rid of the icky manual padding, but to someone who has never written a line of perl, this looks quite nice and readable (although I am quite aware that that is not always the case for perl). So what’s got a bee in my bonnet? Well, I almost didn’t write this until my friend Jeremiah tweeted the following:

Writing powershell. This is happily like writing perl. 7:31am Oct 6th

And a little bit later

ZOMG This is seriously just like writing perl… I <3 you #PowerShell 10:41am Oct 6th

Side note: Jeremiah is about the smartest man in the world when it comes to SQL. Follow him and read his blog, facility9.

Why go about reinventing the wheel? If you are going to make a language similar to perl, just host perl. What’s the harm? It’s hard to call it harm, but it’s inefficient and tastes a little bit too much like embrace, extend, extinguish to me. PowerShell becomes the de facto scripting language on Windows, enticing open source programmers because it is similar to their language of choice. So why fret? PowerShell, perl, batch, bash script, who cares? Well, there are legion reasons why software developers should care.

  1. CPAN, PEAR, GEMS, etc. – These are huge repositories of tested code that can be leveraged quickly and easily by open source developers. CPAN (Comprehensive Perl Archive Network) contains 16,600 modules. PEAR (PHP Extension and Application Repository) contains 536 different packages with 1,255,213 lines of code. RubyForge hosts 8406 different projects. These established languages have a gigantic ecosystem of usable code.
  2. Learning curve – Do you know the intricacies of your chosen language? Did you spend two hours debugging a wily error and because of it have forever learned some dark corner of your language? Can you tell me in your sleep the difference between $foo and $$foo in php? Well none of that will help you in the new PowerShell.
  3. Brain drain – Come up with a really clever way to solve a problem in PowerShell? Great, keep it to yourself. Where is the community? There is powershellcommunity.org, but there is yet to be an established authority for the community.

Now, the problems I outline are true of any emerging language. They are normally offset by some inherent positives in the language. I haven’t examined PowerShell in enough depth to find its intrinsic value. At first blush, and with my limited exposure, it seems to be a competent enough pseudo-object-oriented scripting language, well suited to administrative tasks. Nothing groundbreaking, nothing that knocked my socks off. In the words of Homer Simpson:

I saw you desperately trying to cram one more salty treat into America’s already bloated snack hole. So I did what I could. I did what any loving husband would do! I reached out to some violent mobsters.

PowerShell is just one more salty treat that Microsoft is cramming into America’s already bloated snack hole. It would be fine if they had a level playing field and allowed other scripting languages to be first-class citizens, but they don’t. After the mind-numbing pain of batch scripting, PowerShell seems great. It really starts to lose its shine when you view it against the cornucopia of free, mature, open source scripting languages.

I want to end with a real-world example (anonymized). There was a system that managed blerns. Blerns could be collected in collections called Bars, and Bars could be collected in collections called Foos. These things were numbered, and a system was built to manage them in COBOL. Because COBOL likes to make pictures, the original architects thought to themselves: we will pack the information into an 8-digit number like so, FFBBbbbb, so that 01010001 would be the first blern, in the first bar, in the first foo, 01010002 would be the next, and so on.
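The packing scheme is just positional arithmetic. Here is a minimal sketch in Python (the original system was COBOL, and pack_id and unpack_id are hypothetical names, not anything from the real code base), assuming two decimal digits for the foo, two for the bar and four for the blern:

```python
def pack_id(foo, bar, blern):
    """Pack three counters into one FFBBbbbb number (2 + 2 + 4 decimal digits)."""
    # Hypothetical helper: each field is shifted left by the width of the fields after it.
    return foo * 1_000_000 + bar * 10_000 + blern


def unpack_id(packed):
    """Split an FFBBbbbb number back into (foo, bar, blern)."""
    foo, rest = divmod(packed, 1_000_000)
    bar, blern = divmod(rest, 10_000)
    return foo, bar, blern


first = pack_id(1, 1, 1)
print(f"{first:08d}")    # 01010001
print(unpack_id(first))  # (1, 1, 1)
```

Note that nothing in the arithmetic keeps a field inside its width: pack_id(1, 100, 1) comes back out as (2, 0, 1), because bar number 100 carries over into the foo digits.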

This worked great. The company was selling blerns left and right, and pretty soon foo:01 had 98 bars. Then things started to break down: 3 more bars and the whole house of cards would come crashing down. What to do, what to do?! The engineers were assembled and solutions were offered.

We only have 8 foos, just assign the overflow into foo:10.

Seemed reasonable enough, but that would have resulted in weird code springing up all over the place like this

if foo == 1 or foo == 10
  return "foo the first"
else
  ...
end

Surely there was a better solution. What could it be? Well, suffice it to say I wasn’t present for these meetings; I only saw the terror of the last solution being halfway implemented and this new solution coming in. The solution was a technical-sounding concept called field widening. It amounted to storing the identifiers like so, FFFBBBbbbb: 10 digits, now capable of holding up to 1000 Foos, 1000 Bars per Foo, and 10000 Blerns per Bar. Surely this was better, but was it any good? I guess it might be, as long as we don’t live in an exponential world.

Oh…