JSPs vs. ColdFusion

Thank GOD for ColdFusion. After doing a bunch of JSP stuff, I can't imagine trying to be productive without the tools that CF provides. The less-than-elegant syntax of the JSTL is annoying, but it's the tight integration that I really miss. You don't really appreciate the CFMAIL tag (for example) until you have to spend a few hours getting JavaMail, the Java Activation Framework, and the Jakarta mailer taglib, configuring it all, and then figuring out how to integrate it with the JNDI setup on Tomcat (I gave up and just use parameters in web.xml). Granted, it's a one-time deal, but boy is it a pain in the ass.

And then when you want to do a grouped loop on a recordset (didn't find a simple solution)….
Or pull all characters up to the first space in a string (nothing like digging out a java.util.StringTokenizer)….

Yeah, CF is way nice, and those who knock it are either gluttons for pain, or have rich clients paying them by the hour. Especially since you barely sacrifice any of J2EE's power when using CF. The VERY simple app that prompted all this can be found here: Lindsay's Alphabet Book. One template, 168 lines (layouts not included), containing a handful of conditionals, a single email sender, and 3 queries.

Refining

I found a couple slight glitches with the way I integrated all my non-MovableType pages into the MovableType layouts (I'll write up how all that works at some point). The biggest offender was the lack of use of the 'blog' class on a DIV to add some margins around the page content, which pushed the page titles right up against the top bar. That's fixed now though. I also added a few Jokes, just for the hell of it.

PasswordSafe Updated

I've just released PasswordSafe v0.3.3. The only external change is that changing a file's master key will actually work now. Previously, it would only take if you changed the key and then immediately restarted the application, but now it'll work even if you continue to use the application after changing the key.

As always, the new release can be downloaded here and unzipped/untarred directly over your current installation, as long as PasswordSafe isn't running at the time. All preferences will be maintained across the upgrade.

To Build a Better Terminal

I'd yet to find a terminal application with all the features I wanted. The built-in Terminal application was questionable at best, and I'd been using xterms in X11 for a while, which worked quite well. However, they had one really annoying feature: the 'delete' key didn't work when using emacs over SSH to a Linux machine.

Then I rediscovered iTerm, which is an OSX-only terminal emulation program. Not only did it handle the 'delete' key correctly, but it also uses the 'option' key as META, rather than the 'command' key. That lets you use normal OSX shortcuts (particularly copy and paste) within your terminal session. Very nice.

And then to top it off, I ran across this very good HOWTO (actually a blog entry) that lays out how to set up GNU `ls` on OSX for colored output support. Remarkably brainless, but not something I'd even considered before. Now I've finally got a really nice command line interface to use on the Mac.

Web Usage Stats

In addition to the raw bandwidth stats that I'd previously made available, I set up actual web usage stats as well, using the most excellent Webalizer log analysis package. The first page is a summary of the last twelve months; detailed statistics are available by clicking on the month names in the table. Unlike the bandwidth graphs, the usage statistics are only updated once a day (at 4:02am PST).

Locking in CFMX

There was a post on CF-Talk regarding specifics of locking, and I thought I'd create a summary (though I'm not Ben Forta, who was requested by name), along with some ideas for making the job simpler.

In CF5 and earlier, CFLOCK was required for all shared memory access, as well as for race conditions. With CFMX, CFLOCK is only required for race conditions, because the underlying Java runtime takes care of the shared memory access issues. Great, but what is a race condition?

Race Condition: A situation where two different requests are manipulating a single resource, and it's possible for the two requests to step on each other's toes.

The easiest example is something like these two queries (which deduct an item from the inventory of a given product):

<cfquery datasource="#request.dsn#" name="get">
  SELECT inventory
  FROM product
  WHERE productID = 1
</cfquery>
<cfquery datasource="#request.dsn#">
  UPDATE product SET
    inventory = #get.inventory# - 1,
    inventoryUpdate = now()
  WHERE productID = 1
</cfquery>

If two separate requests start this process perfectly in parallel they'll both get the same result from the first query (say 4), and then when they run the second query, they'll both update the inventory to 3. This is clearly NOT the correct result (it should be 2).

We can solve this problem several ways, but we'll use a CFLOCK statement to do it:

<cflock name="inventoryupdate" type="exclusive" timeout="10">
  <cfquery datasource="#request.dsn#" name="get">
    SELECT inventory
    FROM product
    WHERE productID = 1
  </cfquery>
  <cfquery datasource="#request.dsn#">
    UPDATE product SET
      inventory = #get.inventory# - 1,
      inventoryUpdate = now()
    WHERE productID = 1
  </cfquery>
</cflock>

What we've done is single-thread access to these two queries. Running through the example again, the first request would enter the CFLOCK, and the second would wait for the first to complete. The first would select 4, update to 3, and then exit the CFLOCK. Then the second request would enter the CFLOCK, select 3, update to 2, and exit. Problem solved.

There are better ways to solve this particular problem (relative updates and transactions spring to mind), but this entry is about CFLOCK. What's important is the general type of thing that's happening: a multi-step process concerning a resource that is shared across requests, where later steps depend on results of earlier steps, or the steps must happen all-or-nothing.
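For comparison, here is a rough sketch of what a relative update might look like. It assumes the same product table and request.dsn datasource as the queries above. Because the database reads and decrements the value in a single statement, there is no gap between the read and the write for another request to slip into.

```cfml
<!--- Sketch only: same table and datasource as the example above. --->
<!--- The database computes the new value itself, so no earlier SELECT is needed. --->
<cfquery datasource="#request.dsn#">
  UPDATE product SET
    inventory = inventory - 1,
    inventoryUpdate = now()
  WHERE productID = 1
</cfquery>
```

This only helps when the new value can be expressed in terms of the old one inside the SQL itself. If CF code has to inspect the value in between, you're back to needing a lock or a transaction.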

Where else might we find these kinds of problems?

Well, all CF variables in shared scopes (server, application, session, client) are resources that are shared across requests. This includes instance variables within CFCs that are stored in one of these scopes. Corollary to this last item is the fact that local variables in CFC methods not declared with the var keyword are instance variables. This can result in mysterious bugs that only ever crop up under load, so it's VERY important to use the var keyword properly.
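As a minimal sketch of proper var usage (sumItems is a hypothetical method, not from any real CFC), every local variable, including the loop index, gets declared at the top of the function. Without the two var lines, i and total would land in the CFC's instance variables, and two concurrent calls on a shared instance could trample each other's loop.

```cfml
<!--- Sketch only: sumItems is a hypothetical method name. --->
<cffunction name="sumItems" returntype="numeric">
  <cfargument name="items" type="array" required="true" />
  <!--- In CFMX, var declarations must come first, right after the cfargument tags. --->
  <cfset var i = 0 />
  <cfset var total = 0 />
  <cfloop from="1" to="#arrayLen(arguments.items)#" index="i">
    <cfset total = total + arguments.items[i] />
  </cfloop>
  <cfreturn total />
</cffunction>
```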

We also find it in database access, as demonstrated above, though it's usually better to use CFTRANSACTION to solve those problems, since database-level transactions are very likely going to be a lot more efficient.
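A sketch of the CFTRANSACTION approach, reusing the two queries from the first example, might look like the following. Treat the isolation level as an assumption: whether it actually prevents the race, and whether one of two colliding requests gets an error instead of waiting, depends on how your database implements it.

```cfml
<!--- Sketch only: behavior under contention depends on your database. --->
<cftransaction isolation="serializable">
  <cfquery datasource="#request.dsn#" name="get">
    SELECT inventory
    FROM product
    WHERE productID = 1
  </cfquery>
  <cfquery datasource="#request.dsn#">
    UPDATE product SET
      inventory = #get.inventory# - 1,
      inventoryUpdate = now()
    WHERE productID = 1
  </cfquery>
</cftransaction>
```

Unlike a named CFLOCK, the transaction is enforced by the database, so it also covers other servers and other applications hitting the same table.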

Finally, we see it in other external resources. The most common is files on the filesystem, though external objects (Java, COM) are another. Many objects are internally synchronized, so you needn't worry about locking, but not all. Make sure you check the documentation of your specific object. Notably, most of the Java Collection Classes are NOT synchronized, though there are static methods in the Collections class to turn them into synchronized versions of themselves.
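For the Java collections case, a rough sketch of wrapping a list from CF code could look like this. The variable names are hypothetical; java.util.Collections.synchronizedList() is the standard JDK method.

```cfml
<!--- Sketch only: plainList, collections and application.sharedList are hypothetical names. --->
<cfset plainList = createObject("java", "java.util.ArrayList").init() />
<!--- No init() here: we only need the class's static methods. --->
<cfset collections = createObject("java", "java.util.Collections") />
<!--- synchronizedList() returns a wrapper whose individual method calls are synchronized. --->
<cfset application.sharedList = collections.synchronizedList(plainList) />
<cfset application.sharedList.add("first item") />
```

The wrapper only protects single method calls. Multi-step operations on the list (iterating over it, or checking for an item and then adding it) are still race conditions and still need a lock around them.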

Well, we know we don't have to CFLOCK all access to shared scopes (that went the way of the dodo in CFMX), but when do we have to lock? We again look at the type of operation we need. Clearly reading and writing single variables doesn't qualify, but reading and writing multiple variables does.

So what does this really mean? If you ever write a variable in a shared scope, and any code that depends on it also depends on any other shared value, you must lock all access to the shared variable, both read and write. Ouch. That's a lot of locking, because every variable has to be written, or it wouldn't exist, so that means you have to lock everything except stand-alone variables.
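To make that concrete, here is a small sketch with two related values that always change together. The names hitCount and lastHit are hypothetical, and the code assumes both were initialized elsewhere.

```cfml
<!--- Sketch only: hitCount and lastHit are hypothetical application variables. --->
<!--- Write: both values change together under an exclusive lock. --->
<cflock scope="application" type="exclusive" timeout="10">
  <cfset application.hitCount = application.hitCount + 1 />
  <cfset application.lastHit = now() />
</cflock>

<!--- Read: a readonly lock guarantees both values come from the same update. --->
<cflock scope="application" type="readonly" timeout="10">
  <cfset hitCount = application.hitCount />
  <cfset lastHit = application.lastHit />
</cflock>
```

Without the readonly lock, a request could read the new hitCount alongside the old lastHit if it happened to run between the two CFSETs of another request.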

But fret not, because CFLOCK isn't the only way to lock variable access. You can use some tricks to avoid having to use CFLOCK all over the place. The best one is for application variables that get initialized once, and never change. Since there is only one write event, we can break their lifecycle in two: the write phase, and the read phase. All we need to do is ensure that no request will EVER get to the read phase before the write phase is complete, and that no request will EVER perform the write phase if it has already been performed. If we do that, then we never need to use CFLOCK on application variable reads. The question, of course, is how do we do that? Here's the way I prefer (in Application.cfm, or the root settings file):

 1. <cflock scope="application" type="readonly" timeout="10">
 2.   <cfset isAppWritten = structKeyExists(application, "appWritten") />
 3. </cflock>
 4. <cfif NOT isAppWritten>
 5.   <cflock scope="application" type="exclusive" timeout="10">
 6.     <cfif NOT structKeyExists(application, "appWritten")>
 7.       <!--- set your app variables --->
 8.       <cfset application.appWritten = true />
 9.     </cfif>
10.   </cflock>
11. </cfif>

Why does this work? First we test if we're through the write phase (lines 1-4). If we are, great; otherwise we have to attempt to perform it ourselves. Assuming it's not complete, we then get a lock on the initialization code (line 5). Once we get the lock (potentially waiting for other requests to release it), we again check if the write phase is complete (line 6). We need the second check because it's possible that while we were waiting for the lock, another request finished the write phase. If it's still not done, then we perform the write phase and exit the lock (lines 7-10).

There is a slight fudge going on for efficiency. The outer CFIF is unneeded, because the inner one will work by itself (though the reverse is NOT true). However, getting exclusive locks is expensive (and kills scalability), so we want to avoid it where possible, especially since this code will be executed by EVERY request. The outer CFIF is ensuring that no request will have to get the lock unless it comes in before the first request finishes the write phase, which basically translates to never.

"But what about CFC instance variables?", you're probably saying. "They're application variables too, and they're definitely going to get manipulated, or they'd just be normal application-scope variables." Time for another 'trick', though this one is far less sneaky: we don't have to lock application variables only with a scope="application" CFLOCK.

Instead, inside our CFC, we'll lock all access to instance variables using a named lock. Then the non-CFC application code can still reference the application-scope instances without acquiring a lock, but we retain our ability to prevent race conditions. I prefer to use a UUID for my locking, which is set in the init() method of the CFC into an instance variable. That UUID is then used to lock all instance variable access using a named CFLOCK in exactly the same way as we'd used scoped CFLOCK for "normal" variables.

<cffunction name="init">
  <cfset variables.my.uuid = createUUID() />
  <!--- set inventory variables --->
</cffunction>

<cffunction name="getInventory">
  <cfreturn variables.my.inventory />
</cffunction>

<cffunction name="setInventory">
  <cfargument name="inventory" />
  <cflock name="#variables.my.uuid#" type="exclusive" timeout="10">
    <cfset variables.my.inventory = inventory />
    <cfset variables.my.inventoryUpdate = now() />
  </cflock>
</cffunction>

There are two caveats:

  1. CFCs don't have real constructors, meaning that it's possible to call the init() method multiple times (bad) or call other methods before calling init() (even worse). What does this mean? You need to take a couple precautions. First, all methods should fail if init() hasn't been called. Most CFCs are like this anyway, because they depend on initialization parameters (like a DSN). Second, calls to the init() method must be externally locked. Fortunately, since we're creating and initializing all our application-scope CFCs within the locking framework discussed above, that's already taken care of as well. Just be careful of non-application-scope CFCs (like session-scope).
  2. This type of locking only keeps the CFC's internals in sync. It is still susceptible to the exact same problem we ran into with the first example using two queries (coincidentally, performing the exact same operation). So if in our application code (outside the CFC) we call getInventory() and follow with a setInventory() that uses the value, we still have to lock it on our end, just like the first example.
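The second caveat can be sketched as follows. The instance name application.inventoryService and the lock name are hypothetical; the point is that the caller wraps the read and the dependent write in one lock of its own.

```cfml
<!--- Sketch only: application.inventoryService and the lock name are hypothetical. --->
<cflock name="inventoryDecrement" type="exclusive" timeout="10">
  <!--- The value read here must still be current when it is written back. --->
  <cfset current = application.inventoryService.getInventory() />
  <cfset application.inventoryService.setInventory(current - 1) />
</cflock>
```

Every piece of calling code that does this kind of read-then-write has to use the same lock name, or the lock protects nothing. A tidier option is to add a method to the CFC that does the whole decrement inside the CFC's own lock.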

For external resources, locking is a bit trickier. Files on the local file system are easy: always use a named lock on the canonical absolute pathname. Files on a remote filesystem shared between servers are problematic, because there's no way to use CFLOCK across multiple servers. You'd have to use some kind of semaphore file, and then lock access to that, and it turns into a mess very quickly. External objects can usually be locked using their class name (like files), if they're local. Remote shared objects should have built-in synchronization.
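A rough sketch of the local file case is below. The file name is hypothetical, and going through java.io.File to get the canonical path is just one way to do it; the idea is that every request touching the file derives exactly the same lock name.

```cfml
<!--- Sketch only: the file name is hypothetical. --->
<cfset absolutePath = expandPath("data/orders.log") />
<!--- getCanonicalPath() resolves things like ".." so equivalent paths yield one lock name. --->
<cfset lockName = createObject("java", "java.io.File").init(absolutePath).getCanonicalPath() />
<cflock name="#lockName#" type="exclusive" timeout="10">
  <cffile action="append" file="#lockName#" output="order received" />
</cflock>
```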

Fusedocer

I just got a request for my old FB3 Fusedocer tool off the FB forums. So I dredged it up from the remains of my old site at barneyboisvert.com (which enom/domainzero ate), and reposted it here.

Internet

We finally got an internet connection at our house! Yay! Really quite humorous, when you think about it. Here I am, with my (and my family's) livelihood wholly dependent on the internet and I didn't even have access to it except at my office. Oh well, all better now.

Living Mac

So, the tech today figured it out. One of my sticks of RAM was bad, and it was screwing things up. Pulled it out and everything seems to be back to normal. Bastards at MacMall, after shipping the computer all over hell and back (3 wrong addresses), were kind enough to throw bad ram in there. Gotta love that.

Still a Dead Mac

After leaving my PowerBook at the Apple store in Portland all day, it's still hosed. They can't figure out what the heck is wrong with it. It's now gone through 6 separate installs of OS X from a formatted disk since Friday, and it's still dead. Hopefully the saga will play itself out today so I can return to Bellingham with a working computer, but we'll see.