Archive for the 'development' Category

First Person Documentation

I'm not sure when I started, but I've documented things in the first person for quite a while.  Fusedocs promoted this format, and was probably a significant influence, though I recall doing it back in college as well.  It's clearly not new or uncommon, but I just had a gentleman email me about it (based on finding my old ComboBox script embedded in an app he inherited), and his comments really brought to light why I like the format so much.  Here's an example (from the ComboBox code):

/**
 * I am called to repopulate the dropdown.  There should never be a
 * need to invoke me externally.
 */
ComboBox.prototype.populateDropdown = function() {
  // ...
}

Now you might say that there is no substantial difference between that and the more typical third-person documentation.  However, I think there's a huge difference.

First, in order to anthropomorphize the method enough to write in first person, you have to get inside what it's doing.  That provides useful insight, and can highlight behaviour that really doesn't make that much sense, even though it doesn't jump out and say "I'm wrong!".  It also makes the comments easier to write, I think, because it's more personal.

Second, writing in the first person lets you make certain statements in a far nicer way.  For example, there's no nice way to say "Barney implemented this method poorly", despite the fact that it might be true for whatever reason (misunderstandings, time constraints, etc.).  However, saying "I am implemented poorly" doesn't assign blame.  Even better is "I was poorly implemented by Barney", which would never be written by anyone except Barney, and therefore also prevents assignment of blame, but does let Barney take it upon himself.  Hopefully any of the three is immediately followed by how it was implemented poorly, why it was done that way, and even suggestions for improvement.

The inverse of this is it lends a bit of personality to the code.  Seemingly arbitrary business requirements tend to end up with comments that have a bit of "attitude" to them, for example.  This personality is absolutely instilled by the developer(s) themselves, but it becomes an element of the code itself which makes it far easier to deal with.  It supplies a sort of contextual memory for the code when it's cracked open again in a different time.

Finally, the mindset of the reader (even if it's the same person who wrote it) is different when interfacing with something first hand.  The first person gives you the impression that the code is alive and talking to you, and as we all know, code is a living thing.  The third person is like what you'd see in a museum: information about a static snapshot of something dead.

I'd be quite interested to hear about others' experiences with this style of commenting, good or bad.

Shoot the Engineers

About a week ago, Marc Funaro wrote an interesting blog post about CFML and OO.  The prevailing opinion (via Twitter, blogs, etc) is that Marc is incorrect/inaccurate/inexperienced/whatever, and I disagree completely.  He hit the nail on the head.

HTTP is a stateless, request-response environment.  Nearly all web applications interface with a SQL database, which is also a predominantly stateless request-response environment.  Those are orthogonal to the core OO principle of interacting stateful objects.  It's far closer to the FP (functional programming) paradigm, but particularly on the SQL side, still doesn't match completely.

To use OO in a SQL-backed web app, you hide the mismatches with ORM, an object-based Front Controller implementation, and session facades.  As Marc points out, Java works pretty well in this paradigm for two main reasons:  Java is crazy fast, and Java developers have invested ridiculous amounts of effort in tooling to support this model.  CF has neither of these advantages.  I'm not belittling the effort poured into various frameworks (Fusebox, Model-Glue, ColdSpring, Transfer, etc.), just that they are significantly behind what is available to Java developers.

Unlike Marc, I happen to think that a Front Controller framework is essential, but I don't use an OO one for exactly the reasons he outlines.  I built FB3lite for just this purpose: 70 lines of straightforward procedural code that help me enormously with certain common tasks.  I often masquerade my apps as standalone pages with mod_rewrite (converting /viewUser.html into /index.cfm?do=viewUser), but that's a cheat.

I also use CFCs and ColdSpring for my business tier, but no object (domain) model for me.  The CFCs are really just glorified function libraries that I can use ColdSpring's AOP engine to wrap transactions around without having to manage them explicitly in my code.  In order to get the AOP I have to use CFCs, and I like the namespacing they provide (so I can have a 'doThing' method in multiple namespaces without conflict), but there is no real OO-ness there.

I know what you're saying.

Yes, I often preach the benefits of OO and encourage people to learn about it and use it.  But using a howitzer to hunt mice in your garage is not a clever idea.  If I'm writing Java (or Groovy), I'm going to use OO structures, but that's because of the programming environment.  I am a pragmatic person.  I like to learn about a wide array of tools and then use as few of them as possible, knowing that there are other options available if I need them.

Yes, I built CFGroovy with Hibernate support so I could use ORM in my CFML apps via Groovy objects.  It provides the best of both worlds: the speed and tooling of Java in a CFML environment.  That approach works quite well, but if I don't need the complexity, I'm not going to do it.

My First cf.objective()

I know I'm late to the "cf.objective() recap" party, but I've been both crazy busy and rather tired, so I haven't got to it until now.

First, I'd never been to Minneapolis before, and from the little I saw, it's a pretty nice place.  Obviously I missed the "buried under snow" part, and that definitely puts a damper on it as a potential home, but I liked it.  Very walkable, clean, and aside from the second-story causeways between the buildings, a nice aesthetic overall.  The hotel was in a great spot, with a pretty varied selection of dining and drinking establishments within easy walking distance.

Before I got there, I hadn't quite internalized how small a 200-person conference actually is.  "Social" is a skill I didn't inherit from my father, unfortunately, but with the number of people I knew already, I didn't feel nearly as isolated as I often do at CFUNITED (which is five times the size).

The sessions were pretty good, overall.  I didn't get to go to several that I would have liked to because of scheduling, but c'est la vie.  Here's a rundown of the notable ones I attended:

Adobe's keynote the first day was interesting (I might be mixing it in with some other Adobe presentations), and it was quite fascinating to see the crowd's reaction to certain Centaur features.  CFFINALLY and CFCONTINUE?  Nothing.  It should be noted that I was the only one to applaud them last year at CFUNITED.  Remote diff of server configuration?  Huge applause.  WTF?!?  Script your production environments, people.  If they're ever out of sync, you're doing your job wrong.  ORM stuff got much applause, of course, and rightfully so.  Drag and drop, full-stack scaffolding also did.  Do people actually use that?  Great marketing/sales tool, no question, but for actual applications?!?!  But I digress…

Marc Escher's talk on unit testing was quite interesting, I thought.  I've tried numerous times, with numerous technologies, to really embrace unit testing and failed every time.  I actually had the best luck doing it with Flex, which just drips with irony.  I'm not predicting success next time I attempt it, but I'm confident I'll do better than last time.  In a similar vein, Sean Corfield's talk on cf.spec provided some nice pointers.  I'm not too sure about the "readable" spec document concept, but it's an interesting technique.  Until you can have exactly one spec document, I'm not sure of the utility, but I think that's really an editor/syntax problem, not a conceptual one.

Mark Mandel's intro to Transfer was quite interesting as well.  That I attended might surprise you if you're familiar with the various Hibernate projects I've worked on, but ORM is still voodoo in my mind.  Coming back to the basics and being "introduced" to ORM from the ground up is always interesting, because the subtleties in interpretation provide a great introspective of ORM as a whole.  The odds of me picking up Transfer and using it on a "real" project are pretty small, but I didn't go to learn about Transfer in particular, more about ORM in general.

Let me be clear on this: Transfer is amazing.  It does things with CFML that I would have sworn were impossible, and does them fast enough to be perfectly serviceable.  It's just not the tool that fits my style.  I've been a Hibernate user for many years, and that's a hard framework to supersede.  Honestly, I bet I'll never replace Hibernate with another ORM solution, but instead replace it with an alternate approach (an object database, for example).

As you might expect, I also went to Adobe's talks about the new ORM functionality coming in Centaur.  When I first was exposed to their Hibernate implementation, I was pretty skeptical.  There seemed to be a global misunderstanding of both the technology and the problem it was designed to solve, but that has turned around 180 degrees, and Centaur looks to have pretty robust ORM capabilities.  I've got a major bone to pick with how Adobe is marketing the functionality, but the actual implementation looks pretty sound.  It's hard to get a complete picture with the pre-release secrecy, but I'm a lot more excited about it than I was 6 months ago.

Even more exciting is what will hopefully be coming out of Railo/JBoss in the coming months.  There's been no formal talk of what that looks like yet, and it's probably a safe bet that it'll be similar to ColdFusion's implementation (for obvious reasons), but with Railo and Hibernate both under the JBoss umbrella, I think there's some cool stuff on the way.  Obviously any speculation is just that, but with Railo supplanting ColdFusion in a lot of places I use CFML, I'm understandably excited about it.

The last session was Adam Haskell's talk on mentoring and code review.  That is a misnomer of a title, if you ask me, because while he did talk about that stuff, the point was really about team dynamics.  Working on a team is hard.  Working in a "get things done" environment only makes it worse.  Fostering the team, particularly around helping junior developers move up in the world, takes time and effort, but it's worth it.  I think Adam did a good job of emphasising that any sort of formal process is less effective than an equivalent informal process.  Informal is inherently more personal, and with the typically sterile world of technology and computers, the "personal" stuff is really important.

Of course, the big draw of any conference (though the hardest to justify) is the meals/drinks/etc. that happen outside the actual conference.  (It seems like I just said the same thing two sentences in a row, completely by accident.)  With the conference as small as it was, the informal socialization was a lot tighter, I thought.  Far less spreading of groups, and so more churn within them.

I also really liked the way they did lunch, with actual table service, rather than a buffet.  With a more formal meal, you end up sitting and talking with fewer people, but for a longer period of time.  Aside from fostering more involved conversation, it also provides a nice break from the chaos to resettle for the afternoon.

Talking with other developers is always fun, and typically the source of the best tidbits of information.  I always like learning about stuff, even if it has no direct applicability, because it gets your mind thinking in ways it otherwise wouldn't.  And it seems to happen pretty often that 6 months down the road one of those random bits of information suddenly becomes fairly relevant.  Maybe not directly, but it at least opens my eyes to some potential approach I wouldn't have otherwise considered.

Great conference, overall.  I gotta hand it to Jared and his team.

Effective Photo Manipulation

Image manipulation is a common task for web applications, usually centered around creating and managing thumbnails (of photos, PDFs, videos, whatever).  Photo manipulation is a subset of image manipulation, and has a couple aspects that differentiate it from other types.

First and foremost, photo quality is of high importance.  Contrast this with creating a thumbnail of a PDF (especially a text-heavy one); there's no way you can be representative of the original's detail.  A photo's thumbnail, however, must be as representative as possible.  Unfortunately, photos are typically encoded as JPG files, which use a lossy compression algorithm, which means that every time you manipulate them in any way, you're reducing the quality.

Photos are also often subject to user-editing (cropping, annotating, etc.), and while this is commonly done with dedicated photo editing tools (installed on the user's computer), plenty of web apps do it too.  I'm not thinking Photoshop Web, here, I'm talking about your site's photo gallery software that lets you upload originals and rotate them so they're facing the right way.

Between these two things, you (the developer) can get stuck in a bind.  You need to be able to edit photos repeatedly, but at the same time maintain quality.  The trick is to only ever edit a photo once.  Seriously.

This is simpler than it sounds.  The trick is to start thinking about designing your app in terms of operations (rotating) instead of states (rotated).

When you get your hands on the original image, put it somewhere safe.  That's the only image you're ever going to deal with.  When your user comes and rotates it, write the specifics of that operation to the database, and then take your original and run all the stored operations on it to create the "current" view.  Then when they crop it, write the operation to the database, and then take your original and run all the stored operations on it to create the "current" view.  You can see where I'm going here.

Same thing goes for thumbnails.  If you need a 150×150 thumbnail of a photo, don't take the "current" view and resize it; take the original, run all the stored operations on it, and tack on a "resize to 150×150" at the end to create the thumbnail.

This works because you only run through the lossy compression once, so your generated views stay at a noticeably higher quality than if you save after each operation.  It might not be apparent in a 50×50 thumbnail, but with two operations (rotate then crop) on a full-size image, the difference in quality between doing them together and encoding in between is quite apparent.
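Here's a rough sketch of the replay idea in JavaScript.  Everything in it is made up for illustration: the "image" is just a width/height pair standing in for real pixel data, the operation shapes are my own invention, and rotation is assumed to be in multiples of 90 degrees.  In a real app, the single lossy encode would happen once, on whatever render returns.

```javascript
// A stand-in for real pixel data: only the dimensions are tracked here.
const original = Object.freeze({ width: 4000, height: 3000 });

// Each stored operation is plain data, as it might be read from a database.
// These shapes are hypothetical, not from any particular library.
const storedOps = [
  { type: "rotate", degrees: 90 },
  { type: "crop", width: 2000, height: 2000 },
];

// Apply one operation and return a NEW image; the input is never modified.
function applyOp(image, op) {
  switch (op.type) {
    case "rotate":
      // Assumes multiples of 90: a quarter turn swaps width and height.
      return op.degrees % 180 === 0
        ? { ...image }
        : { width: image.height, height: image.width };
    case "crop":
      return {
        width: Math.min(image.width, op.width),
        height: Math.min(image.height, op.height),
      };
    case "resize":
      return { width: op.width, height: op.height };
    default:
      throw new Error("Unsupported operation: " + op.type);
  }
}

// Always start from the original and play back every operation in order.
function render(image, ops) {
  return ops.reduce(applyOp, image);
}

// The "current" view, and a thumbnail built from the original (not from
// the current view) with a resize tacked on the end.
const current = render(original, storedOps);
const thumbnail = render(original, [
  ...storedOps,
  { type: "resize", width: 150, height: 150 },
]);
```

The point is that neither current nor thumbnail is ever used as the input for anything else; both are disposable and can be rebuilt from the original plus the operation list.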

Of course, you can't do this sort of processing on demand, so you have to store the results of the transformations.  The key is to never read those in to do the next step, always start from the original and play back all the manipulations in sequence.  This mechanism fits very nicely with the mod_rewrite-based caching mechanism I wrote about a couple days ago.  When a new operation is saved, wipe out all the cached versions of the photo and let them rebuild as needed.

This isn't a particularly revolutionary idea, and I know many photo tools (e.g. Picasa) do things this way, but I've found that it's worth the extra effort.  Obviously its utility is predicated on having decent originals, but believe me, your users will notice.

Functional Programming Languages

Ever done functional programming?  Chances are you'll say "no", but you'll probably be wrong.  Javascript is a functional language, and while a lot of people use it in a procedural and/or object oriented way (\me raises hand), its foundation is functional.  Same deal with ActionScript.  Used Groovy?  Ruby?  Python?  None are functional (let alone pure), but all have significant functional aspects.

So what is a purely functional language?  In a nutshell, it's a language that doesn't have the concept of mutability.  Things never change in a functional environment, all that happens is that new things are derived from old things.  Here's an example, which should be familiar to anyone who has done Groovy/Ruby/Python or used jQuery/Prototype:

nameArray = userObjectArray.pluckField("firstName").sort()

No mutation.  Only derivation.  It's like magic.  What's happening is the pluckField method is creating a new array with first names, and then the sort method is creating a new array in sorted order and that third array is what is set to the 'nameArray' variable.
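For the curious, here's a minimal sketch of the same chain in plain JavaScript.  The sample data and the pluckField and sorted helpers are made up for illustration.  One caveat: JavaScript's built-in Array.prototype.sort sorts the array in place, so to keep the "derivation only" property the sketch copies the array before sorting.

```javascript
// Hypothetical sample data, frozen so accidental mutation is easier to spot.
const userObjectArray = Object.freeze([
  { firstName: "Carol" },
  { firstName: "Alice" },
  { firstName: "Bob" },
]);

// Illustrative helper: derives a new array of one field's values and
// leaves its input untouched.
function pluckField(items, field) {
  return items.map((item) => item[field]);
}

// Array.prototype.sort mutates the array it is called on, so copy first
// and sort the copy.
function sorted(items) {
  return items.slice().sort();
}

const firstNames = pluckField(userObjectArray, "firstName");
const nameArray = sorted(firstNames);
```

Three arrays exist at the end (the users, the plucked names, and the sorted names), and none of them was changed after it was created.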

This is ridiculously powerful, because it eliminates scoping issues from the mix.  There is no place a race condition can crop up anywhere, because everything is immutable.  What does that mean to you the developer?  It means the end of worrying about thread safety.  No more locks or scoping issues (e.g. 'var' in CFML).  It also results in really readable code.  Start at the left, read every word, and if you have decent function names it's pretty obvious what happens.

Of course, sometimes change is good, so most functional languages have the concept of mutability.  The pure ones are typically reserved for the math nerds, since algorithms in a pure functional language can actually be "proven" just like a mathematical proof.

I don't really know what my point is, except that CFML continually pisses me off with its lack of any type of functional nature.  It sort of has higher order functions, but since it lacks closures, they're of minimal utility.  Even currying would give them some basic helpfulness, though certainly a kludge.

I was writing a simple Google Charts wrapper (to replace an SVG/Batik engine) on an app that doesn't have Groovy available to it and it's such a friggin' pain.  It's almost enough to make me want to go build a CFGroovy Lite that I can cheat into place more easily than the full framework.  Not that Groovy is even close to purely functional, but it's easy to use as if it is.

If only Clojure wasn't Lisp; I don't have that many parentheses.  A JVM-based functional language with a C/Java-style syntax would be truly excellent.

Show Me Your Tool

If you read my blog regularly, chances are you write software and therefore can't, because your tools don't exist in the visual world.  They're just magic strings of minuscule magnets on a rapidly spinning chunk of plastic…

I took my chef's knife to the sharpener a few days ago.  Cost a whopping $4 to have him put a wicked edge on it, and I watched it happen.  I saw him carefully run the blade along a belt sander (for lack of a better term) a few times to give it the rough shape, then he used a bench grinder to finish the edge, a steel to hone it, and finally a jeweler's wheel to polish it.  Not five minutes elapsed before he handed it back to me, wrapped in butcher's paper for the journey home.

If he let me loose in his shop, there's no way I could have achieved the same result.  But given ten knives, I bet I could get a pretty good edge on the last few (after undoubtedly destroying the first couple).  Nothing like his result, to be sure, but significantly sharper than the initial state.

Sharpening a knife is a pretty simple task, because a knife is an inherently simple item, but it's just one example.  Consider a master furniture maker.  He can take the same wood you and I buy at Home Depot or Lowe's and with his tools and expertise turn it into a beautiful bureau or armoire.  Turn me loose in his shop and I'd probably be able to make a functional dresser in twice the time it'd take him to make an exquisite one.  With some more experience, both using the tools and in furniture construction overall, I've no doubt I could make something I'd be proud to have in my home.  It wouldn't be the same quality as something the master craftsman created, no question, but better than the prefab stuff you might otherwise buy.

So what's special about these tasks?  Nothing, really.  Most things are of a similar nature: cooking, playing music, grooming dogs, surfing, etc.  Attaining mastery of a given profession requires certain in-born characteristics, but attaining laudable proficiency is pretty much available to anyone willing to put in the time (barring physical disabilities and such).

Every chest of drawers provides a way to organize and store clothes.  People spend a lot of money on well made dressers that are made of pretty woods, appeal to their personal tastes (mission, contemporary, etc.), or are simply of a higher quality of manufacture.  None of which has the least to do with holding clothes.  Every dresser I've ever seen holds clothes with about equal proficiency, but even though the drawers are a bit sticky, I still use the one I had as a child.

Now consider software development.  The tools are invisible.  The process is invisible.  The result is intangible.  As far as a profession for a craftsman goes, programmers are fucked.  Sure, we get paid because people are willing to pay for the benefits of our software, but it's 100% functional.  No one buys software because it's well made or "pretty".  They might pick between two vendors because one is less error-prone, but that's still functional.

Every database application provides a way to organize and store data.  No one spends extra money on a database system because it was made with snazzy buttons, appeals to their sense of style, or was produced by a higher quality process.  Every database system I've used is inconvenient in one way or another (no OFFSET, no CTEs, etc.), and every one is built using some completely opaque process by unknown automatons in some office building somewhere.

It has occurred to me that the reason for this could simply be that programming is so damned hard it can't be automated.  As a result, there's no way to produce the gradations of craftsmanship that you see in dressers (from the mass-produced pressboard affairs to the hand-crafted hardwood masterpieces).  With software all you get are the hand-crafted versions.  Sure, some of them are simply horrible, but they're all hand-crafted.

Which brings me back to the point: programming is opaque for everyone that isn't also a programmer.  There's absolutely no way you can take your average Joe, sit him next to you while you write something, and then give him your workstation and have him do the same.  Unlike the furniture maker where a simple demo is enough to get the gist of what is happening, with software it's all abstract and divorced from anything tangible.  My mom (who is fairly technically adept) doesn't have any idea what the hell Subversion is, and even if I sufficiently explained it, there's no way she would understand how massively beneficial vendor branches are.  Heck, a lot of programmers don't understand vendor branches.  And yet a one-year-old can run her fingers over a piece of wood and tell you if you need to keep sanding (if not run it through the planer again).

Further, there's absolutely no way Joe (or my mom) can look at two pieces of software and compare their "quality" on any meaningful level.  He can make distinctions like "this one crashes more", or "that one has confusing icons", but that's it.   Even a competent programmer looking at a piece of software has the internals almost completely hidden from them.  Very careful observation can provide certain clues (the query optimizer must be making decision X based on inputs A, B and C), but by and large, everything is opaque.  Again, contrast this with a fine bureau where you can see the carefully planed wood, the perfectly matched dovetail joints on the drawers, and the complete lack of any visible fasteners.

I certainly consider myself a craftsman.  I hope to justify calling myself a master someday, but today is not that day.  And yet every evening, as I'm walking out the front door to head home, I think about how completely impossible my work is to appreciate.  My kids ask what I did today, and I have no meaningful answer to give.  The best I can do is "fixed some bugs", or "had an architecture meeting".  I can't explain to my non-programmer friends what I do, or why it has such appeal.  I live a dual life: a "normal" one and a programmer one.  They are as compatible as fire and ice.  I greatly enjoy the praise and criticism I receive from my peers regarding stuff I share, especially when it helps make others' lives easier, but I'd trade it all to bring something home from work one day, show it to Lindsay and Emery, and have them say "Wow, Daddy, that's amazing."

HTTP is an API

Ray Camden posted an interesting article over on InsideRIA about expanding short urls using jQuery and ColdFusion.  After reading the article, I thought he was overcomplicating things somewhat by relying on the url shortening services' APIs to do the lookups.  Yes, that's what APIs are for, but for this case, HTTP happens to be a perfectly sufficient API, it's consistent across services, it requires no special interface code, and it's crazy simple.  I commented to that effect, and then decided I ought to put my money where my mouth is.

To that end, I wrote a small demo app that uses several different bits of tech to get the job done (render an ugly page with auto-expanding URLs).  It uses JSON/P to load some tweets containing shortened URLs from the Twitter Search API, writes them to the DOM with jQuery after wrapping the URLs with A tags (the part Ray did manually), and then wires a jQuery mouseover listener to trigger expansion (virtually identical to Ray's).  Of course, the server side is what I was actually interested in, so here it is:

<cfhttp method="get"
  url="#attributes.expand#"
  redirect="false"
  result="cfhttp" />
<cfoutput>#cfhttp.responseHeader["Location"]#</cfoutput>

Pretty simple, eh?  Both a rapier and a howitzer can kill a man; picking the right one is important.

Why does this work?  Because every service is just doing an HTTP 301 with a Location header from the short URL to the full URL.  Some of them expose APIs for creating and querying the short URLs, but for this task, we don't actually care; we only care about the Location header.  The only special processing required is for preview.tinyurl.com links; I simply convert them to normal tinyurl.com links so they do the "right" thing.
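To make those two pieces concrete, here's a small JavaScript sketch of the logic (not the demo app's actual code).  Both helper names are hypothetical, and the response is assumed to be a plain object with a status and lower-cased header names, taken from a request that did not follow redirects automatically.

```javascript
// Hypothetical helper: map preview.tinyurl.com links onto tinyurl.com so
// they redirect like every other short URL.
function normalizeShortUrl(shortUrl) {
  const url = new URL(shortUrl);
  if (url.hostname === "preview.tinyurl.com") {
    url.hostname = "tinyurl.com";
  }
  return url.toString();
}

// Hypothetical helper: given a response that was NOT auto-followed, return
// the redirect target, or null if the response isn't a usable redirect.
// Assumes a plain { status, headers } object with lower-cased header names.
function locationOf(response) {
  const isRedirect = response.status >= 300 && response.status < 400;
  const location = response.headers["location"];
  return isRedirect && location ? location : null;
}
```

Accepting any 3xx status rather than only 301 is my own choice for the sketch; the services described above all answer with a 301.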

As always, there is full source (all 55 lines of it, in one CFM template) displayed beneath the app itself should you wish to see all the details.  The JavaScript stuff is far more complex than needed for a demo app, but I wanted to do a little experimentation on the client side as well as the server-side stuff.

Spring 2 "scope" goodness for ColdSpring

In Spring 1.2 (and ColdSpring, which emulates it), you have the "singleton" attribute, which was a boolean flag for whether a bean is a singleton (the default) or a prototype (instantiated afresh for every getBean call).  If you've used Spring 2.0+, you've probably come across the "scope" attribute, which supersedes the "singleton" attribute, and allows singleton, prototype, request, and session lifecycles.

In most cases, singleton and prototype are all you need, but it's occasionally useful to scope a bean to a request or a session.  But what does that mean, exactly?

With a prototype bean, you get a new instance back every time you call getBean.  With a singleton bean, you get the same instance back every time you call getBean for the life of the BeanFactory.  With a request bean, you get the same instance back every time you call getBean for the life of the request.  You can probably guess what happens for a session bean.
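If it helps, those lifecycles can be sketched in a few lines of JavaScript.  This is an illustration of the semantics only, not Spring's or ColdSpring's API: makeScopedGetter, the context object, and its request and session properties are all made up, with plain objects standing in for the real scopes.

```javascript
// Illustrative only: returns a getBean function that caches the bean in
// whichever container matches the requested scope.
function makeScopedGetter(factory, scope) {
  const singletonHolder = {}; // lives as long as the getter itself

  return function getBean(context) {
    // context is a hypothetical { request, session } pair of plain objects.
    let container;
    switch (scope) {
      case "prototype":
        container = {}; // fresh container, so nothing is ever reused
        break;
      case "request":
        container = context.request;
        break;
      case "session":
        container = context.session;
        break;
      case "singleton":
        container = singletonHolder;
        break;
      default:
        throw new Error("The '" + scope + "' scope is not supported.");
    }
    if (!("bean" in container)) {
      container.bean = factory();
    }
    return container.bean;
  };
}

// Count how many instances the factory creates.
let created = 0;
const getRequestBean = makeScopedGetter(() => ({ id: ++created }), "request");

const session = {};
const firstRequest = { request: {}, session };
const secondRequest = { request: {}, session };

const a = getRequestBean(firstRequest);
const b = getRequestBean(firstRequest); // same request: same instance as a
const c = getRequestBean(secondRequest); // new request: a different instance
```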

Since ColdSpring emulates the Spring 1.2 way of doing things, you don't have access to request- or session-scoped beans, unless you manually load them into the appropriate scope and always reference them from there, instead of the BeanFactory.  That's a mess, so I wrote a simple wrapper bean called BeanScopeCache to provide this functionality in bean form.

To use it, you define your target bean as you normally do (ensuring you set singleton="false"):

<bean id="requestConfig_target"
    class="com.barneyb.cache.requestconfig"
    singleton="false">
  <constructor-arg name="contentcache"><ref bean="contentCache" /></constructor-arg>
  <constructor-arg name="publicUrl"><value>${publicUrl}</value></constructor-arg>
</bean>

and then create a BeanScopeCache as such:

<bean id="requestConfig" class="com.mentor.util.BeanScopeCache">
  <constructor-arg name="targetBeanName">
    <value>requestConfig_target</value>
  </constructor-arg>
  <constructor-arg name="scope">
    <value>request</value>
  </constructor-arg>
</bean>

In this case, my target bean is named "requestConfig_target" and through BeanScopeCache it'll be tied to the request scope.  Note that you can't use a nested bean, because BeanScopeCache needs a bean name, not an injected instance.  Here's the source:

<cfcomponent output="false" extends="coldspring.beans.factory.FactoryBean">

  <cfset SCOPE_KEY = "__com_mentor_util_bean_scope_factory_cache_key__" />

  <cffunction name="init" access="public" output="false" returntype="BeanScopeCache">
    <cfargument name="targetBeanName" type="string" required="true" />
    <cfargument name="scope" type="string" required="true" />
    <cfargument name="bindToBeanFactory" type="string" default="false"
      hint="Whether to bind instances to this bean factory.  Ignored for singletons." />
    <cfset variables.targetBeanName = targetBeanName />
    <cfset variables.scope = scope />
    <cfset variables.bindToBeanFactory = bindToBeanFactory />
    <cfreturn this />
  </cffunction>

  <cffunction name="setBeanFactory" access="public" output="false" returntype="void">
    <cfargument name="beanFactory" type="coldspring.beans.BeanFactory" required="true" />
    <cfset variables.beanFactory = beanFactory />
    <cfset INSTANCE_ID = createObject("java", "java.lang.System").identityHashCode(beanFactory) />
  </cffunction>

  <cffunction name="isSingleton" access="public" output="false" returntype="boolean">
    <cfreturn scope EQ "singleton" />
  </cffunction>

  <cffunction name="getObject" access="public" output="false" returntype="any">
    <cfset var key = SCOPE_KEY & targetBeanName />
    <cfset var container = "" />
    <cfif bindToBeanFactory>
      <cfset key &= INSTANCE_ID />
    </cfif>
    <cfswitch expression="#scope#">
      <cfcase value="prototype">
        <cfset container = {} />
      </cfcase>
      <cfcase value="request">
        <cfset container = request />
      </cfcase>
      <cfcase value="session">
        <cfset container = session />
      </cfcase>
      <cfcase value="singleton">
        <cfset container = variables />
      </cfcase>
      <cfdefaultcase>
        <cfthrow type="IllegalArgumentException"
          message="The '#scope#' scope is not supported." />
      </cfdefaultcase>
    </cfswitch>
    <cfif NOT structKeyExists(container, key)>
      <cfset container[key] = beanFactory.getBean(targetBeanName) />
    </cfif>
    <cfreturn container[key] />
  </cffunction>

</cfcomponent>

Project Euler Test Harness

Project Euler is a collection of mathematics/computer science problems, as you probably already know.  I've solved almost 50 of them so far, and I've developed a collection of utilities to make my job easier.  Some of them (factorization routines, prime generators, etc.) I'm not going to share as they are fundamental to solving certain problems.  However, I will share my test harness, as it's probably generally useful, and it contains no "secrets".

Why build a custom harness?  Well, I originally considered using JUnit (or whatever) as a simple runner, but since there isn't anything to test, it didn't feel right.  It also doesn't provide a good way to get timing information for the runs.  And since I was originally using strictly Groovy, I didn't want to impose any more "ceremony" than I needed to.  I've since started doing some problems in Java simply for the performance (as well as converting some of those "secret" routines), and have generalized my harness.

First, here are the file templates for Groovy:

import util.Runner

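// pass the closure as an explicit argument; as of Groovy 1.7,
// "new Runner() { ... }" is parsed as an anonymous inner class
new Runner({
  // put some code here, eh?
})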

and for Java:

import util.Runner;
import util.java.Solver;

public class BruteForce {
  public static void main(String[] args) {
    new Runner(new Solver() {
      public Object solve() {
        // put some code here, eh?
        return null;
      }
    });
  }
}

As expected, the Java version requires quite a bit more ceremony, even though the semantics of the two templates are identical.  The Solver interface (used for Java) is as follows:

package util.java;

public interface Solver {
  public Object solve();
}

Finally, here is the Runner class, which is implemented in Groovy.  As you can see from the templates, it accepts either a Solver instance or a Groovy closure, and treats the two in exactly the same way, except that it picks between Solver.solve() and Closure.call() based on the type.

package util

import util.java.Solver

class Runner {

  private static ThreadLocal timer = new InheritableThreadLocal()
  private static int timerCount = 0

  def Runner(solver) {
    this(true, solver)
  }

  def Runner(doIt, solver) {
    if (! doIt) {
      return
    }
    startTimer()
    // this will send an update every five or so seconds
    volatile def updateThread
    updateThread = Thread.startDaemon {
      while (true) {
        Thread.sleep(5000)
        if (updateThread == null) {
          break // after the sleep, before the log
        }
        log("still executing...")
      }
    }
    try {
      log("starting...")
      def result;
      if (solver instanceof Solver) {    // the closure/Solver switch
        result = solver.solve()          //  |
      } else { // better be a Closure    //  |
        result = solver.call()           //  |
      }                                  //  V
      stopTimer()
      if (result instanceof String) {
        try {
          result = new BigInteger(result)
        } catch (e) {
          // oh well
        }
      }
      if (result instanceof Integer || result instanceof Long || result instanceof BigInteger) {
        log("result: $result")
      } else {
        def s = "" + result
        if (s.length() > 100) {
          s = s[0..<97] + "..."
        }
        log("result was not numeric: $s", System.err)
      }
    } finally {
      updateThread = null
    }
  }

  def elapsed() {
    get().elapsed
  }

  def formatElapsed(e) {
    def seconds = e.intdiv(1000)
    def millis = e % 1000
    def s = ""
    if (seconds >= 60) {
      s += seconds.intdiv(60) + ":"
      seconds = seconds % 60
    }
    if (s.length() > 0 && seconds < 10) {
      s += "0$seconds"
    } else {
      s += seconds
    }
    if (millis < 10) {
      s += ".00$millis"
    } else if (millis < 100) {
      s += ".0$millis"
    } else {
      s += ".$millis"
    }
    s
  }

  def log(Object msg, out=System.out) {
    def t = get()
    out.println "[${t.index}: ${formatElapsed(t.elapsed)}] $msg"
  }

  private startTimer() {
    Runner.timer.set(new TimerData(++timerCount))
  }

  private stopTimer() {
    get().stop()
  }

  private get() {
    def t = Runner.timer.get()
    if (! t) {
      throw new IllegalStateException("No timer has been started...")
    }
    t
  }
}

class TimerData {
  def index
  def startDate
  def stopDate

  def TimerData(threadIndex) {
    index = threadIndex
    startDate = new Date()
  }

  def stop() {
    stopDate = new Date()
  }

  def getElapsed() {
    if (running) {
      return new Date().time - startDate.time
    } else {
      return stopDate.time - startDate.time
    }
  }

  boolean isRunning() {
    stopDate == null
  }
}

There's a lot going on there, but it boils down to three things: setting up a timing framework, running a background thread that tells you it's still executing every five seconds, and doing some magic with the returned solution (a String result is converted to a BigInteger if possible, and anything non-numeric is truncated to 100 characters and logged to System.err).  As you can see, the constructor takes an optional first parameter for whether to execute, which is useful if you have multiple Runners in a single script/class but only want to run some of them.
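For anyone who reads Java more easily than Groovy, here is a rough sketch of just the timing and "still executing" pieces.  It is not the harness itself: MiniRunner, run, and the Supplier-based signature are hypothetical names chosen for illustration, it skips the thread-local timer bookkeeping and the result handling, and it stops the update thread with an interrupt rather than by nulling out a variable.

```java
import java.util.function.Supplier;

final class MiniRunner {

    /** Formats milliseconds as [m:]s.mmm, following the same rules as formatElapsed above. */
    static String formatElapsed(long elapsedMillis) {
        long seconds = elapsedMillis / 1000;
        long millis = elapsedMillis % 1000;
        StringBuilder s = new StringBuilder();
        if (seconds >= 60) {
            s.append(seconds / 60).append(':');
            seconds %= 60;
            if (seconds < 10) s.append('0');   // zero-pad seconds only after a minutes part
        }
        s.append(seconds).append('.');
        if (millis < 100) s.append('0');       // always show three millisecond digits
        if (millis < 10) s.append('0');
        return s.append(millis).toString();
    }

    /** Runs the solver, logging an update roughly every five seconds until it returns. */
    static <T> T run(Supplier<T> solver) {
        long start = System.nanoTime();
        Thread heartbeat = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(5000);
                    long elapsed = (System.nanoTime() - start) / 1_000_000;
                    System.out.println("[" + formatElapsed(elapsed) + "] still executing...");
                }
            } catch (InterruptedException e) {
                // interrupted by run() once the solver has finished
            }
        });
        heartbeat.setDaemon(true);             // never keep the JVM alive on its own
        heartbeat.start();
        try {
            T result = solver.get();
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            System.out.println("[" + formatElapsed(elapsed) + "] result: " + result);
            return result;
        } finally {
            heartbeat.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(formatElapsed(5));      // prints 0.005
        System.out.println(formatElapsed(65432));  // prints 1:05.432
        run(() -> {
            long sum = 0;
            for (int i = 1; i <= 100; i++) sum += i;
            return sum;
        });
    }
}
```

Interrupting the thread means the sleep ends immediately when the solver finishes, so there is no window between "after the sleep" and "before the log" to guard against.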

It wasn't my intent, but building this out was quite educational about how the Java/Groovy interplay works.  Write Groovy for whatever you can, and write Java for the bits that need it.  As you can see from the templates, Java requires a lot more work, but it has benefits (performance, in my case).  I was quite pleased to be able to use my Groovy harness (which leverages all kinds of Groovy goodness) for my Java solutions.  I originally figured I'd have to port it to Java to use it with both languages, but that turned out not to be the case.

Prime Factorization

In my ongoing rampage through the Project Euler problems I've needed a prime factorization routine several times, and figured I'd share it as a complement to the sieve I shared last week.  Implemented in Groovy, of course.  The typing is a little strange at first glance because Groovy uses Integer by default, which only supports values up to about 2.1 billion (2^31 - 1).  So there are a number of explicit BigInteger types/casts.

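import static java.lang.Math.floor
import static java.lang.Math.sqrt

def factor(BigInteger num) {
  if (num < 1) {
    // zero would loop forever below, and negatives have no real square root
    throw new IllegalArgumentException("Can only factor positive integers.")
  }
  def factors = []
  // special case: pull out every factor of two so the loop can skip evens
  while (num % 2 == 0) {
    factors << 2 // append (push() prepends as of Groovy 2.5)
    num = (BigInteger)(num / 2)
  }
  // no factor of what remains can exceed its square root
  BigInteger end = floor(sqrt(num))
  for (def currFactor = 3; num > 1 && currFactor <= end; currFactor += 2) {
    if (num % currFactor == 0) {
      while (num % currFactor == 0) {
        factors << currFactor
        num = (BigInteger)(num / currFactor)
      }
      // the target shrank, so the search bound shrinks with it
      end = (BigInteger) floor(sqrt(num))
    }
  }
  // whatever is left is prime (or the input was one)
  if (num != 1 || factors.size() == 0) {
    factors << num
  }
  factors
}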

The algorithm is pretty simple: loop through all odd numbers and see if they're a factor of the target number.  There's a special case at the top for two (the only even prime), which then lets the loop skip all the even numbers (effectively cutting its work in half).  There's also a special case at the bottom for one.  If you attempt to factor one or a prime number, you'll get that number back as the only item in the list; otherwise you'll get only prime factors (and no, one is not prime).  Note that the list can contain duplicates; the method returns all prime factors, not just distinct ones.

Why does this work?  Because by the time the loop reaches a given number, every smaller prime has already been divided out of the target.  A composite (non-prime) number would need one of those smaller primes as a factor, so it can no longer divide what remains.  So if a number is a factor, it has to be prime as well.
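Here is the same idea as a minimal Java sketch, for comparison.  The TrialDivision class name is made up for illustration, and one detail differs from the Groovy version: instead of computing floor(sqrt(num)) through a double, it compares the candidate squared against what remains of the target, which stays exact for values beyond double precision.  It assumes the input is at least one.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class TrialDivision {

    private static final BigInteger TWO = BigInteger.valueOf(2);

    /** Returns all prime factors of num (with repeats), smallest first. */
    static List<BigInteger> factor(BigInteger num) {
        if (num.signum() <= 0) {
            throw new IllegalArgumentException("Can only factor positive integers.");
        }
        List<BigInteger> factors = new ArrayList<>();
        // special case: strip every factor of two so the loop can skip evens
        while (num.mod(TWO).signum() == 0) {
            factors.add(TWO);
            num = num.divide(TWO);
        }
        // try odd candidates while candidate^2 <= what is left of the target
        for (BigInteger f = BigInteger.valueOf(3);
                f.multiply(f).compareTo(num) <= 0;
                f = f.add(TWO)) {
            // every smaller prime is already divided out, so any f that
            // divides the remainder here must itself be prime
            while (num.mod(f).signum() == 0) {
                factors.add(f);
                num = num.divide(f);
            }
        }
        // whatever is left is prime (or the input was one)
        if (!num.equals(BigInteger.ONE) || factors.isEmpty()) {
            factors.add(num);
        }
        return factors;
    }

    public static void main(String[] args) {
        System.out.println(factor(BigInteger.valueOf(360)));  // prints [2, 2, 2, 3, 3, 5]
        System.out.println(factor(BigInteger.valueOf(97)));   // prints [97]
        System.out.println(factor(BigInteger.ONE));           // prints [1]
    }
}
```

Multiplying the returned factors back together should always reproduce the input, which makes for an easy sanity check.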

For those of you who follow me on Twitter (@barneyb), this is not the crappy algorithm I first implemented.  This is the good one.  My first attempt was an "inversion" of the sieve algorithm which resulted in huge amounts of unneeded work and memory consumption because of the data structures involved.  The second algorithm is virtually identical; only the data structures are different (enormously simpler).  I wasn't really that far off with the first attempt, I just didn't do a good job of distilling the problem down to its essence, so I was doing a lot of extra work (three-orders-of-magnitude extra).