Return of the Das Keyboard

About a year and a half ago I got a Das Keyboard Ultimate for work, but immediately had some major issues using it.  Fortunately, the 10-year-old keyboard it was replacing wasn't completely unusable, so I just switched back.  That keyboard finally gave up the ghost a couple weeks ago, unfortunately, so I was stuck with the Das Keyboard.  And even though they say it'll work across a PS/2 adapter (though not the built-in USB hub, of course), I couldn't get my machine to recognize it through PS/2.  Not sure what the issue is there.

After a couple weeks, I've actually adapted to most of the common "scan-inverse pairs" and type them more slowly to avoid errors.  Unfortunately, there are a lot of such pairs, and some of them occur fairly infrequently (e.g., "oa").  So I still have a lot of screwups with the infrequent ones, but the common ones (e.g., "th") have been addressed with muscle memory.  Kind of pisses me off that my "ultimate" keyboard has trained my muscles to type sssllllooooowwwwweeerrrr instead of faster.  But whatever.  Probably the most damning inversion, however, is due to BACKSPACE being late in the scan order.  So if you have a one-character typo and hit BACKSPACE followed by the correct character (especially a left-hand character) in rapid succession, it almost invariably types the new character first and immediately deletes it, leaving the typo untouched.  Arrrgh.

I've also noticed a surprising amount of wear on the keys for a couple weeks' usage.  The spacebar has already completely lost its texture along the bottom edge and the letters (especially the home row) are noticeably polished.  Same for BACKSPACE (which I end up using a lot now), ENTER, and the left-side SHIFT, CTRL, and ALT keys.

On the positive side, the action is really nice.  I'm a fan.  I think the weighting of the keys is a little light, but I type hard.  The keyboard as a whole is also really heavy so it doesn't slide around on my desk, particularly when my right hand is dancing around between letters, the numeric keypad, and the arrow keys.

Their site says that they've redone the electronics to not have such issues with concurrency, so hopefully that's true.  It looks like they may have also revved their switches, which probably will change the action to some degree, as well as making a less clicky version available.  I'm certainly not going to spend $130 for a next gen version without getting to test drive the new hardware first, but in the meantime I'll probably keep using this one.

A Word About Development Environments

I saw this today, and thought it hilarious:

Java, being a mainstream programming language, has attracted major software companies to pour money and human effort into it. As a consequence, a lot of good integrated development environments (IDEs) are out there. Of course, there is nothing wrong with being a real programmer by using the good old BSD vi plus a shell. Or (a little more civilized), emacs. Others who would like a more modern IDE will certainly enjoy eclipse. The main strength of eclipse, besides being totally free, is that it's built in with unit testing (JUnit) and refactoring, the two ingredients of so-called agile programming or extreme programming. It also integrates version control seamlessly. It's much more usable than the MS Visual Studio and VSS dual in nearly every aspect, except of course for generating standard-incompliant Java code.

– Christopher M. Brown, from http://www.cs.rochester.edu/~brown/242/assts/bh_learning/learning.html.

Medians and Quartiles

A number of years ago I built this little app called EventLog.  It's a really simple data journal: you enter a set of tags and a timestamp (defaulting to "now"), and it saves it off to a database.  Then you can build all kinds of reports and such based on your data to help you track "stuff".  As an example, here's a chart about my benadryl (which helps my skin enormously) consumption since mid-July (click it for the full report):

All kinds of neat, you might say.

One of the other reports that you can access if you're logged in (the report above was made publicly available) is called a hiatus report, which instead of reporting on the events themselves, reports on the length of the hiatuses between events.  For example, I aim to have a benadryl every 8 hours or so, so monitoring my hiatuses is more useful than the actual data points themselves.  However, the raw hiatuses aren't terribly interesting as the number of events goes up; you want stats on them (average, median, deviation, etc.), so I added that this evening.
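To make the hiatus idea concrete, here's a minimal JavaScript sketch.  It is not EventLog's actual code: the function names are made up, and it assumes the event timestamps are plain numbers (hours, say).

```javascript
// Hypothetical helpers, not EventLog's implementation.
// Given event timestamps (as numbers), return the gaps between
// consecutive events, i.e. the "hiatuses".
function hiatuses(timestamps) {
  const sorted = [...timestamps].sort((a, b) => a - b);
  const gaps = [];
  for (let i = 1; i < sorted.length; i++) {
    gaps.push(sorted[i] - sorted[i - 1]);
  }
  return gaps;
}

// Arithmetic mean of a non-empty list of numbers.
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Doses at hours 0, 8, 17 and 24 give hiatuses of 8, 9 and 7 hours.
const gaps = hiatuses([0, 8, 17, 24]);
const averageGap = mean(gaps); // 8
```

Note that n events only yield n - 1 hiatuses, so a single event produces no stats at all.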

And then Kim (aka Dr. Repp, microbiologist and bio-statistician) wanted quartiles as well as median.

Quartiles are kind of a pain.  It's not obvious how exactly they are computed.  Median is easy: just line up your points in ascending order and take either the middle one (if there is one), or average the middle two (if there isn't).  Quartiles are not the same algorithm applied to the 25% and 75% points (the 50% quartile is the median, of course).  Rather, they are the medians of the two halves of the data on either side of the median.  In particular, the median value (if there is one) is not part of either half; it's the pivot.  Think quicksort.

An example will make this more clear.  Consider this set of values (already sorted):

[1, 2, 3, 4, 5, 6, 7, 8, 9]

The median is obviously five, which leaves four points on either side of it which will be used to compute the quartiles.  So the 25% quartile is the median of this subset:

[1, 2, 3, 4]

The 75% quartile is the median of this subset:

[6, 7, 8, 9]

Note that five isn't in either one.  These subsets yield medians of 2.5 and 7.5, which correspond to the 25% and 75% quartile values.

Now consider this set of values (also already sorted):

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Here the median is 5.5 (the average of 5 and 6), and as that isn't an actual value in the set, the entire set will be represented in the two quartile subsets:

[1, 2, 3, 4, 5]
[6, 7, 8, 9, 10]

This yields quartiles of 3 and 8.

Not a difficult algorithm, but a bit tricky: you don't figure out the quartiles directly on the data.  You have to compute the median, split the data, and then figure out the quartiles from the subsets.
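Here's the algorithm above as a small JavaScript sketch.  The function names are mine, and it assumes at least two numeric values (with fewer, the halves are empty and the quartiles come back as NaN).

```javascript
// Median of an already-sorted array of numbers.
function median(sorted) {
  const n = sorted.length;
  const mid = Math.floor(n / 2);
  return n % 2 === 1
    ? sorted[mid]                          // odd count: the middle value
    : (sorted[mid - 1] + sorted[mid]) / 2; // even count: average the middle two
}

// Quartiles as medians of the two halves on either side of the median.
// With an odd count the median value itself is the pivot and lands in
// neither half.
function quartiles(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const half = Math.floor(sorted.length / 2);
  const lower = sorted.slice(0, half);
  const upper = sorted.slice(sorted.length - half);
  return { q1: median(lower), median: median(sorted), q3: median(upper) };
}

quartiles([1, 2, 3, 4, 5, 6, 7, 8, 9]);     // { q1: 2.5, median: 5, q3: 7.5 }
quartiles([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]); // { q1: 3, median: 5.5, q3: 8 }
```

Other quartile conventions exist (some include the median in both halves, some interpolate), so don't be surprised if a stats package gives slightly different numbers.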

Inline JS Progress Bars

2011-05-17: A richer version is available at http://www.barneyb.com/barneyblog/2011/05/17/even-better-inline-progress-bars/.

If you've ever built a web app that does background and/or batch processing of stuff, you've invariably created a bit of markup like this:

#numberFormat(sentEmailCount, ',')# of #numberFormat(emailCount, ',')# sent...

which then renders like this:

Wouldn't it be nice to create this markup (simply with a wrapping span) instead:


<span class="progress" rel="#sentEmailCount / emailCount#">
#numberFormat(sentEmailCount, ',')# of #numberFormat(emailCount, ',')# sent...
</span>

and have it render like this:

You think it would be?  I thought so too, so I wrote a simple little jQuery snippet to do exactly that.  Here she be:

jQuery(".progress").each(function() {
  var $this = jQuery(this)
  var frac = parseFloat($this.attr("rel"))
  if (isNaN(frac)) {
    return // nothing to do
  }
  $this.css({
      display: "inline-block",
      position: "relative",
      border: "1px solid #ccc",
      backgroundColor: "#eee",
      color: "#eee",
      padding: "0 5px"
    })
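    .append(''
      + '<span style="display:inline-block;position:absolute;z-index:0;top:0;left:0;height:100%;width:' + (frac * 100) + '%;background-color:#dfd;border-right:1px solid #6c6;"></span>'
      + '<span style="display:inline-block;position:absolute;z-index:1;top:0;left:5px;color:#000;">' + $this.html() + '</span>'
    )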
})

The basic idea is to take the span, make it render like a bar, hide the text (same color as the background), and then append a pair of absolutely positioned spans to draw the progress bar and then lay the existing content atop it.  It's not terribly robust, but it works like a champ if your needs are simple, and the code is straightforward enough that it should be easy to tweak if you have specific needs.

Use it as is, hack it up, turn it into a massively popular plugin and make a bazillion dollars.  Have fun.

Update: here's the same functionality, repackaged into separate JS and CSS so it's easier to deal with. JS first:

jQuery(".progress").each(function() {
  var $this = jQuery(this)
  var frac = parseFloat($this.attr("rel"))
  if (isNaN(frac) || frac > 1) {
    return // nothing to do
  }
  $this.addClass("container")
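    .append(''
      + '<span class="progress-bar" style="width:' + (frac * 100) + '%"></span>'
      + '<span class="status-text">' + $this.html() + '</span>'
    )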
})

And the CSS:

.progress.container {
  display: inline-block;
  position: relative;
  border: 1px solid #ccc;
  background-color: #eee;
  color: #eee;
  padding: 0 5px;
}
.progress.container .progress-bar {
  display: inline-block;
  position: absolute;
  z-index: 0;
  top: 0;
  left: 0;
  height: 100%;
  background-color: #dfd;
  border-right: 1px solid #6c6;
}
.progress.container .status-text {
  display: inline-block;
  position: absolute;
  z-index: 1;
  top: 0;
  left: 5px;
  color: #000;
}

Minor AmazonS3.cfc Bug Fix

Today I identified a subtle bug with the listObjects method of AmazonS3.cfc dealing with delimiters.  If you supply a prefix that ends with a trailing delimiter, certain paths would be returned partially truncated.  Removing the trailing delimiter solves the issue, so there's an easy workaround, but I've added a snippet to take care of that if you inadvertently pass one in.  The patched CFC is available here.  You can always get the latest version on the project page.
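The workaround amounts to trimming the prefix before it's used.  This JavaScript sketch only illustrates the idea; it is not the snippet from the CFC, and the function name is hypothetical.

```javascript
// Hypothetical helper, not the code in AmazonS3.cfc.
// Strip a single trailing delimiter from a listing prefix.
function normalizePrefix(prefix, delimiter) {
  if (delimiter && prefix.endsWith(delimiter)) {
    return prefix.slice(0, -delimiter.length);
  }
  return prefix;
}

normalizePrefix("photos/2009/", "/"); // "photos/2009"
normalizePrefix("photos/2009", "/");  // "photos/2009" (unchanged)
```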

Scheduled Downtime This Evening

Just a heads up that the server hosting barneyb.com and all its various offspring (PotD, EventLog, etc.) will be going down about eight this evening to replace a faulty cooling fan.  Total outage should be less than 15 minutes, and it will be a complete outage (the IPs will be dead).

On-The-Fly YUI Compressor

A couple years ago I wrote about using YUI Compressor to do build-time aggregation and compression of static assets.  That works well and good if you have a build environment, but that's not always the case.  So, still using YUI Compressor, I set up a simple script that'll do runtime aggregation and compression of assets using my favorite mod_rewrite-based file caching mechanism.

The basic idea is that your HTML includes a reference to "agg_standard.js", which is an alias for whatever JS files you need for your standard user (as opposed to a mobile user, for example).  That request comes to your server and if the file exists, gets served back like any other JS file.  If it doesn't exist, however, mod_rewrite will forward it to a CFM page to generate it:

RewriteCond     %{REQUEST_FILENAME}     !-s
RewriteRule     (/my_app)/static/(agg_.*)\.js$ $1/aggregator.cfm?name=$2.js

In our example, the request passed to aggregator.cfm would have "agg_standard.js" as the 'name' attribute, which is how we'll figure out what we need to aggregate together:

switch (url.name) {
case "agg_standard.js":
  files = [
    "jquery/jquery" & (request.isProduction ? ".min" : ""),
    "jquery/jquery.ui" & (request.isProduction ? ".min" : ""),
    "yui/yahoo-min",
    "yui/event-min",
    "yui/history-min",
    "util",
    "jscalendar/calendar",
    "jscalendar/lang/calendar-en",
    "jscalendar/calendar-setup",
    "dracula/raphael-min",
    "dracula/graffle",
    "dracula/graph"
  ];
  break;
case "agg_iphone.js":
  files = [
    "jquery/jquery" & (request.isProduction ? ".min" : ""),
    "yui/yahoo-min",
    "yui/event-min",
    "yui/history-min",
    "util"
  ];
  break;
}

The important bit, of course, is the Groovy script that actually does the aggregation and compression.  It uses YUI Compressor, so you'll need to have the yuicompressor-x.y.z.jar file on your classpath (probably in /WEB-INF/lib).  Here it is:

import com.yahoo.platform.yui.compressor.*

sw = new StringWriter()
variables.files.each {
  def f = new File(variables.STATIC_DIR + it + '.js')
  sw.append('/* ').append(f.name).append(' */\n')
  if (! it.endsWith(".min") && ! it.endsWith("-min")) {
    def compressor = new JavaScriptCompressor(f.newReader(), null)
    compressor.compress(sw, -1, false, false, false, false)
  } else {
    sw.append(f.text.trim())
  }
  sw.append('\n')
}
variables.buffer = sw.toString()

Pretty straightforward: it just loops over the files, using a StringWriter to build up the aggregated buffer.  Each file is either compressed into the Writer or simply appended based on whether the file has already been minified (based on ".min" or "-min" in the filename).  Each file also gets a comment label in the Writer above its contents so that the aggregated file is a little easier to parse (at the expense of a few extra bytes).  Once done, the Writer's contents are stored in the 'buffer' variable for caching on the filesystem:

<cfset fileWrite(STATIC_DIR & url.name, buffer) />
<cflocation url="#url.name#" addtoken="false" />

You'll notice that I'm not streaming the buffer back out to the user, but instead 302-ing back to the same URL.  This is important.  The reason is that Apache does a whole bunch of stuff to optimize static assets, and if I serve the content back with CFCONTENT, I'll miss out on all of that.  Yes, the 302 has a little bit of overhead on the initial page load, but it reduces the total transfer size by several hundred KB (because of the GZIPping), and avoids a rerequest on the next page load (because of the cache headers).  So it's completely worth it, all the more so because this is an application likely to generate extended usage rather than a content-centric site that is likely to see single-page "bounce" visits from search engines.
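Stepping back to the aggregation loop itself, here's roughly the same logic as a JavaScript sketch, in case the Groovy is unfamiliar.  Everything here is an assumption for illustration: the sources come from an in-memory map rather than the filesystem, and `minify` is whatever function you pass in, not YUI Compressor.

```javascript
// Illustrative only: mirrors the shape of the Groovy loop, not its details.
// files:   ordered asset names without the ".js" extension
// sources: map of asset name -> file contents (stand-in for reading files)
// minify:  caller-supplied compression function (stand-in for YUI Compressor)
function aggregate(files, sources, minify) {
  let out = "";
  for (const name of files) {
    const label = name.split("/").pop() + ".js";
    out += "/* " + label + " */\n";           // comment label above each file
    const alreadyMinified = name.endsWith(".min") || name.endsWith("-min");
    out += alreadyMinified
      ? sources[name].trim()                  // pre-minified: append as is
      : minify(sources[name]);                // otherwise compress first
    out += "\n";
  }
  return out;
}
```

The filename check is the only thing deciding whether a file gets compressed, so a minified file without ".min" or "-min" in its name will get run through the compressor a second time.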

The last piece of the puzzle is handling versioning of your assets.  When you change your JS file, you necessarily have to invalidate your cache (by deleting the files) so the aggregated version can be rebuilt with the new JS.  The easiest way to do that is to use pseudo-versioning of your assets.  You'll see a lot of sites will add a timestamp or a version number to their files (e.g., "arc/yui/reset_2.6.5.css" from Yahoo.com) so that when they update the file it gets a new filename, and is therefore redownloaded by everyone (because it doesn't exist in their cache).  That's great, but it means you have to rename your files all the time which is kind of a pain.  But you can fake it:

<cfset STATIC_ASSET_VERSION = 15 />
<script type="text/javascript" src="static/agg_standard_#STATIC_ASSET_VERSION#.js"></script>

That'll generate a request to "agg_standard_15.js", as you might imagine, which isn't going to work so well.  But we can just change the 'switch' line from the first snippet to this:

switch (REReplace(url.name, "^(agg_.*?)(_[0-9]+)?\.js$", "\1.js")) {

Now it'll strip out that "fake" number string and switch on just "agg_standard.js", which is what we want.  But that 'fileWrite' call later will still use the full filename (with the number embedded).  That way subsequent requests will get the filesystem caching, the headers and GZIPping from Apache, and all the other love.  And when you rev your files, you need only increment the STATIC_ASSET_VERSION variable and you'll have a brand new set of virtual URLs for all your assets, no fuss, no muss.
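If you want to convince yourself the pattern behaves, here's the same regular expression in JavaScript (the function name is just for illustration).  The lazy `.*?` is what lets the optional version group grab the trailing digits.

```javascript
// Same pattern as the REReplace call above, in JavaScript syntax.
// Maps a versioned asset name back to its canonical name.
function canonicalName(name) {
  return name.replace(/^(agg_.*?)(_[0-9]+)?\.js$/, "$1.js");
}

canonicalName("agg_standard_15.js"); // "agg_standard.js"
canonicalName("agg_standard.js");    // "agg_standard.js"
canonicalName("agg_iphone_2.js");    // "agg_iphone.js"
```

One thing to keep in mind: an alias whose real name ends in an underscore plus digits would have that suffix stripped too, so don't name your aliases that way.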

Oh, and just in case you're wondering, the aggregation and compression is fast.  If you've ever used the command line or Ant task, you might fear that it's slow, but most of the time you see there is from the JVM spinning up, not the actual compression.  Since this is all in-JVM, you don't pay any of that cost.  It's certainly not fast enough to have it run in production on every request (hence the file caching), but it's totally reasonable to do on your production box as part of deploying a new version of the app.  It's also probably fast enough to have running every request on your internal test/staging boxes, though that'll depend on how much you're aggregating/compressing among other things.

Flash Scope CFC

If you've ever used Grails, you probably know about the 'flash' scope that it provides for passing data from one request to the next.  Certainly not a must-have feature, but it's quite handy.  The typical use case is after a form submit, you set stuff into the flash scope and then redirect to some page where you use the flash scope's contents to spit a message back to the user.  For example, after editing a user and submitting, you'd redirect the browser to the user listing page and display a "Successfully updated user" message at the top of the page.

The benefit of using the flash scope for doing this is that it only lasts one request, so if the user refreshes the listing page, they won't see the success message again (nor should they, since a user wasn't updated on the refresh).  It also keeps your URLs clean (because you don't have to pass stuff along on the query string), and lets you pass complex data (since there is no serialization).

I'm sure a lot of people have faked this functionality in some sort of use-specific way, and I know I'm guilty.  However, after preparing to do it yet again, I thought I'd build a more generic mechanism that I could reuse.  Before I give you the actual code, here's how you might use it:

<cfcase value="onRequestStart">
  <cfset flashScope = createObject("component", "FlashScope") />
</cfcase>

<cfcase value="updateGoal">
  <cfset xfa.success = "goalList" />
  <cfset goalService = request.beanFactory.getBean("goalService") />
  <cfset goalService.updateGoal(
    session.user.getId(),
    attributes.id,
    attributes.name,
    attributes.definition
  ) />
  <cfset flashScope.put("message", "Goal Updated Successfully!") />
  <cfset location("#self##xfa.success#") />
</cfcase>

<cfcase value="goalList">
  <cfset goalService = request.beanFactory.getBean("goalService") />
  <cfset goalList = goalService.getGoals(session.user.getId()) />
  <cfif flashScope.has("message")>
    <cfset statusMessage = flashScope.get("message") />
  </cfif>
  <cfset include("dsp_goallist", "bodyContent") />
  <cfset do("lay_auto") />
</cfcase>

As you can see, we're instantiating the flash scope at the start of every request.  Then in 'updateGoal', the 'message' key is set into the flash scope.  Finally in 'goalList', if there's a 'message' in the flash scope, it is set into a local variable (to be emitted within dsp_goallist.cfm).  Pretty straightforward.  The actual mechanism is a pair of structures: one in the request scope for incoming variables and one in the session scope for outgoing variables.  The FlashScope CFC is nothing more than a facade to those two structures.  This implies that you must have the session scope available to your application, of course.
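To illustrate the two-structure idea, here's a JavaScript sketch.  It is not a port of the CFC: the class shape and the `flashOutgoing` and `flashIncoming` keys are invented for the example, and plain objects stand in for the session and request scopes.

```javascript
// Hypothetical sketch of a flash scope; not the actual FlashScope CFC.
class FlashScope {
  constructor(session, request) {
    this.session = session; // survives across requests
    this.request = request; // lives for a single request
  }

  // Outgoing values are parked in the session for a later request.
  put(key, value) {
    if (!this.session.flashOutgoing) this.session.flashOutgoing = {};
    this.session.flashOutgoing[key] = value;
  }

  // On the first read in a request, move whatever is parked in the
  // session into the request and clear the session side.
  load() {
    if (!this.request.flashIncoming) {
      this.request.flashIncoming = this.session.flashOutgoing || {};
      this.session.flashOutgoing = {};
    }
    return this.request.flashIncoming;
  }

  has(key) {
    return Object.prototype.hasOwnProperty.call(this.load(), key);
  }

  get(key) {
    return this.load()[key];
  }
}
```

Because the move happens on first read rather than at request start, values wait in the session until some request actually looks at the flash scope.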

I haven't created a full-on project for this, but if it garners sufficient interest and development activity, I'll certainly do that.  In the meantime you can view/download the CFC (or if you're using Subversion, svn:externals it into your project).

In the interest of full disclosure, I want to call out two shortcomings in the implementation, both of which are intentional (to minimize complexity):

  1. There is no respect paid to dealing with concurrent requests.  They'll happily intermix their variables in the flash scope, potentially causing errors and/or bizarre behaviour.
  2. The scope isn't really tied to the 'next' request, it's tied to the 'next request that reads the flash scope'.  This can yield unexpected behaviour if your app assumes that the flash scope will be consumed, but it isn't for whatever reason.

Both of these would require some sort of nonce on every request, which adds a whole layer of complexity and necessitates passing some sort of request parameter on the URL or whatever.  That's a) a mess, b) complicated, and c) application-specific.  As such, I've opted to intentionally not handle those cases since they're edge cases and I'd rather keep things simple.  That and I have this inexplicable obsession with 80-100 line, single-file microframeworks (see FB3lite, CFGroovy, TransactionAdvice, etc.).  :)

Closures, Closures, Closures

Guess what time it is, kids!!

It's "Barney still wants CFML closures" time!  Yay!

Today's impetus is Edmund, Sean Corfield's event driven programming framework.  In order to register event listeners, you have to supply a CFC instance with a specific method to be invoked on it, and which accepts an Edmund Event as its sole argument.  Which means you have to have these silly little CFCs hanging about that simply get the event and then hand it off to the appropriate business components to actually do stuff.  Yes, I understand that's a very OO way to do it: lots of little, purpose-specific types passing messages between them.  But it's a bitch with CFML because every type has to be its own file and you don't get context inheritance.

In Java most of the time your event listeners are anonymous inner classes – instances of classes that are defined inline.  That mechanism gets a lot of grief, and while I agree that it's a little more verbose than necessary in simple cases, it's a lot better than the crap that static typing and/or checked exceptions foists on you, and having a full class definition can be useful as things get more complex.

Groovy smoothes that a little by allowing you to define Closures and use them instead.  Under the hood, it's just converting them to your standard Java anonymous inner classes, but the syntax is rather nicer.  If you know Ruby, think lambdas.

This is what I'd do in Java:

edmund.register("eventCreated", new GenericHandler() {
  public void handleEvent(Event e) {
    beanFactory.getBean("eventservice").postprocessEvent(e.value("eventId"));
  }
});

or in Groovy:

edmund.register("eventCreated", { e ->
  beanFactory.getBean("eventservice").postprocessEvent(e.value('eventId'))
})

But in CFML, I need to create a separate type (in a separate file):

<cfcomponent>

  <cffunction name="init">
    <cfargument name="beanFactory" />
    <cfset variables.beanFactory = arguments.beanFactory />
  </cffunction>

  <cffunction name="handleEvent">
    <cfargument name="e" />
    <cfset beanFactory.getBean("eventservice").postprocessEvent(e.value("eventId")) />
  </cffunction>

</cfcomponent>

And then register it like this:

<cfset edmund.register("eventCreated",
  createObject("component", "MyEventHandler").init(beanFactory)
) />

What a mess.  Not only do I have to create a separate file to have that one single line of code in it (line 10), but I also have to worry about passing context around (in this case, the beanFactory), because the CFC instance doesn't inherit the context it's instantiated in (as closures and anonymous inner types do).  And this is a ridiculously trivial example.  Here's what I'd like to see in CFML:

edmund.register("eventCreated", function(e) {
  beanFactory.getBean("eventservice").postprocessEvent(e.value("eventId"));
});

You can see that I'm following the ECMAScript-like nature of CFML expressions and stealing ECMAScript's function literal syntax (one of them, at least) to create an anonymous function.  In order for this to work, it'd have to be a closure (as ECMAScript functions are), not just a context-free function.  An alternative (which is more complicated, but which has certain advantages in certain scenarios) would be to have a CFC literal (anonymous inner CFC) as well:

edmund.register("eventCreated", new component() {
  function handleEvent(e) {
    beanFactory.getBean("eventservice").postprocessEvent(e.value("eventId"));
  }
});

Personally, I'd much prefer the closure if I only got one, but both would be nice.  The semantics of anonymous inner types can be a little wonky, and the advantages over simple closures are small, but they can still be really useful.  Now that we have a CFSCRIPT-based way of defining components, the language at least has syntactic constructs to express anonymous types, so it's theoretically possible.
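Since the wished-for syntax is borrowed from ECMAScript, here's what the closure version looks like in real JavaScript.  The `createBus` helper and the fake bean factory are stand-ins invented for the example; they are not Edmund's or ColdSpring's APIs.

```javascript
// Hypothetical stand-in for an event bus; not Edmund's real API.
function createBus() {
  const listeners = {};
  return {
    register(name, fn) {
      (listeners[name] = listeners[name] || []).push(fn);
    },
    dispatch(name, event) {
      (listeners[name] || []).forEach(function (fn) { fn(event); });
    }
  };
}

// Fake bean factory whose "eventservice" just records what it was given.
const processed = [];
const beanFactory = {
  getBean(name) {
    return { postprocessEvent(id) { processed.push(id); } };
  }
};

// The listener never receives beanFactory as an argument; it simply
// closes over the variable from the scope it was defined in.
const bus = createBus();
bus.register("eventCreated", function (e) {
  beanFactory.getBean("eventservice").postprocessEvent(e.eventId);
});

bus.dispatch("eventCreated", { eventId: 42 });
// processed is now [42]
```

No extra file and no context passing, which is exactly the point.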

So since I can't do all this neat stuff, you might ask what I did do.  I used ColdSpring with a purpose-specific adapter I wrote, along with a custom extension to Edmund to allow registering listeners from ColdSpring.  So my Edmund config looks like this:

<bean id="edmund">
  <constructor-arg name="asyncByDefault"><value>false</value></constructor-arg>
  <property name="eventListeners">
    <map>
      <entry key="eventCreated">
        <bean class="edmund.framework.ColdSpringListener">
          <constructor-arg name="handler"><ref bean="eventservice" /></constructor-arg>
          <constructor-arg name="method"><value>postprocessEvent</value></constructor-arg>
        </bean>
      </entry>
    </map>
  </property>
</bean>

The ColdSpringListener bean simply takes care of delegating to the configured method on the supplied bean when it is triggered, passing the event arguments along.  This obviously is still pretty verbose, but the per-listener overhead is four lines of XML (in an existing file), not a 13 line CFC (in a new file).  That counts as a win by me.  The way I extended Edmund allowed passing a single listener or an array of listeners.  The method I added is here:

<cffunction name="setEventListeners" returntype="any" access="public" output="false"
  hint="I register multiple new listeners at once.  I DO NOT REPLACE listeners,
    despite being a setter.  I should be named addMultipleEventListeners, but must
    be named setEventListeners so I can be used from ColdSpring.">
  <cfargument name="listeners" type="struct" required="true"
    hint="I am a struct with event names as keys and either a single listener or
      array of listeners as values.  Note that there is no way to specify the
      handler method or whether listeners are asynchronous." />
  <cfset var e = "" />
  <cfset var i = "" />
  <cfloop collection="#listeners#" item="e">
    <cfif NOT isArray(listeners[e])>
      <cfset i = listeners[e] />
      <cfset listeners[e] = arrayNew(1) />
      <cfset arrayAppend(listeners[e], i) />
    </cfif>
    <cfloop from="1" to="#arrayLen(listeners[e])#" index="i">
      <cfset register(e, listeners[e][i]) />
    </cfloop>
  </cfloop>
  <cfreturn this />
</cffunction>
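The ColdSpringListener adapter itself isn't shown here, but the delegation it performs is simple enough to sketch.  This JavaScript version is a guess at the shape, not the CFC; `makeListener` is a made-up name.

```javascript
// Hypothetical sketch of the adapter idea; not the ColdSpringListener CFC.
// Wraps a handler object and a method name in something that exposes
// the handleEvent method the event framework expects.
function makeListener(handler, method) {
  return {
    handleEvent(event) {
      return handler[method](event);
    }
  };
}

// Usage: delegate "handleEvent" to eventService.postprocessEvent.
const seen = [];
const eventService = { postprocessEvent(e) { seen.push(e.eventId); } };
const listener = makeListener(eventService, "postprocessEvent");
listener.handleEvent({ eventId: 7 });
// seen is now [7]
```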

So is this good stuff?  It's OK.  It's solid for a CFML solution.  But there is no question that it pales in comparison to the elegance with which you could solve the problem in other languages.  Java (same age as CFML) has this, Ruby (a little older than CFML) has this, Python (much older than CFML) has this, and don't even get me started on Lisp (more than double the age of CFML).  It's not a concept new to computer science.

Alright.  Rant over.  Until next time I seriously consider rebuilding huge swaths of CFML in Groovy just for closures and start yelling again.