Archive for the 'coldfusion' Category

Build-Time Aggregation of JS/CSS Assets

Ben Nadel posted this morning about compiling multiple linked files (JS/CSS) into a single file, and he does it at runtime. I commented about doing it at build time instead, and a couple of people wanted to know more, so here's a brief explanation.

The first part is a properties file (which can be read by both Ant and CF (or whatever)). Here's an example (named agg.js.properties):

# the type of file being aggregated (used to do minification)
type         = js
# the URL path the files are relative to.
urlBasePath  = /marketing/js/
# the list of filenames to aggregate.  The first line (with the equals
# sign) should be a filename and a backslash; all other lines should be a
# comma, a filename, and a backslash (the last line has no backslash).
# Indentation is irrelevant.
filenames    = date.js\
  ,jquery-latest.js\
  ,ui.datepicker.js\
  ,ui.mouse.js\
  ,ui.slider.js\
  ,ui.draggable.js\
  ,jquery.dimensions.js\
  ,jquery.easing.1.2.js\
  ,jquery-easing-compatibility.1.2.js\
  ,coda-slider.1.1.1.js\
  ,jquery.tooltip.min.js\
  ,jScrollPane.min.js\
  ,jquery.metadata.js\
  ,prototype.classes.js\
  ,reporting.js\
  ,jquery.ajaxQueue-min.js\
  ,script.js

This sets up everything needed for the aggregation. Within our project, we have this file as a peer of the properties file (named agg.js.cfm):

<cfscript>
filename = replace(getCurrentTemplatePath(), ".cfm", ".properties");
fis = createObject("java", "java.io.FileInputStream").init(filename);
bis = createObject("java", "java.io.BufferedInputStream").init(fis);
props = createObject("java", "java.util.Properties").init();
props.load(bis);
bis.close(); // also closes the underlying FileInputStream, releasing the file handle
urlBasePath = props.getProperty("urlBasePath");
type = props.getProperty("type");
filenames = listToArray(props.getProperty('filenames'));
for (i = 1; i LTE arrayLen(filenames); i = i + 1) {
	if (type EQ "css") {
		writeOutput('<link rel="stylesheet" href="#urlBasePath##filenames[i]#" type="text/css" />');
	} else { // js
		writeOutput('<script src="#urlBasePath##filenames[i]#" type="text/javascript"></script>');
	}
	writeOutput(chr(10));
}
</cfscript>

It reads the properties file and writes out either LINK or SCRIPT tags, as appropriate, for the individual assets. This facilitates easy debugging in development, because nothing is modified from its source. The file is included into the HEAD of our layout templates to get everything into the page.
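As a rough sketch of that include, a layout template sitting in the same directory as agg.js.cfm might look like the following. The layout markup and title are made up for illustration; only the CFINCLUDE line matters.

```cfm
<html>
<head>
  <title>Example layout</title>
  <!--- in development this writes one SCRIPT tag per file listed in agg.js.properties --->
  <cfinclude template="agg.js.cfm" />
</head>
<body>
  <!--- page content goes here --->
</body>
</html>
```

A CSS properties file (with type = css) would presumably get its own peer template, included the same way.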

The real magic happens with Ant, which we use for our deployments. Within the build file, we have a call to the aggregateAssets target for each properties file:

<antcall target="aggregateAssets">
  <param name="propfile" value="${output}/wwwroot/marketing/templates/agg.js.properties" />
  <param name="rootdir" value="${output}/wwwroot/marketing/js" />
</antcall>

The params specify the properties file and the root directory. Note that the rootdir param corresponds with the urlBasePath in the properties file. The target itself looks like this:

<target name="aggregateAssets">
  <!-- read the aggregation properties -->
  <property file="${propfile}" prefix="agg" />

  <!-- get the root -->
  <propertyregex property="agg.root"
    input="${propfile}"
    regexp="^(.*)\.properties$"
    select="\1" />

  <!-- split the root into file and path sections -->
  <propertyregex property="agg.fileroot"
    input="${agg.root}"
    regexp="^.*/([^/]+)$"
    select="\1" />
  <propertyregex property="agg.pathroot"
    input="${agg.root}"
    regexp="^(.*/)[^/]+$"
    select="\1" />

  <!-- set up the output file stuff -->
  <property name="agg.outfile" value="${rootdir}/${agg.fileroot}" />
  <property name="agg.cfmfile" value="${agg.root}.cfm" />
  <property name="minsuffix" value=".yuimin" />

  <!-- run everything through the YUI Compressor -->
  <for list="${agg.filenames}" param="filename">
    <sequential>
      <echo message="compressing @{filename} to @{filename}${minsuffix} (in ${rootdir})" />
      <java classname="com.yahoo.platform.yui.compressor.YUICompressor"
        failonerror="true"
        output="${rootdir}/@{filename}${minsuffix}"
        append="true"
        logError="true"
        fork="true">
        <arg value="--type"/>
        <arg value="${agg.type}"/>
        <arg value="--nomunge"/>
        <arg file="${rootdir}/@{filename}" />
        <classpath>
          <pathelement path="${java.class.path}"/>
        </classpath>
      </java>
    </sequential>
  </for>

  <!-- aggregate all the compressed files together -->
  <!-- block comments are used because they are valid in both JS and CSS output -->
  <echo file="${agg.outfile}" message="/* built by Ant using YUI Compressor */" />
  <for list="${agg.filenames}" param="filename">
    <sequential>
      <concat destfile="${agg.outfile}" append="true">
        <header trimleading="true">
          /* @{filename} */
        </header>
        <filelist dir="${rootdir}" files="@{filename}${minsuffix}" />
      </concat>
    </sequential>
  </for>

  <!-- delete all the compressed files -->
  <delete>
    <fileset dir="${rootdir}" includes="*${minsuffix}" />
  </delete>

  <!-- write the CFM file to pull in the compressed and aggregated file -->
  <if>
    <equals arg1="${agg.type}" arg2="css" />
    <then>
      <echo file="${agg.cfmfile}"><![CDATA[<link rel="stylesheet" href="${agg.urlBasePath}${agg.fileroot}" type="text/css" />]]></echo>
    </then>
    <else>
      <echo file="${agg.cfmfile}"><![CDATA[<script src="${agg.urlBasePath}${agg.fileroot}" type="text/javascript"></script>]]></echo>
    </else>
  </if>
</target>

The target reads the properties file, runs each listed asset through the YUI Compressor, and then aggregates the results. Finally, it overwrites agg.js.cfm (from above) with one that contains a single LINK/SCRIPT element pointing to the aggregation result. The end result is a single aggregated, compressed asset in production for speed, and separate uncompressed assets in development for easy debugging.
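To make that concrete, here is what the build should leave behind for the sample properties file above. This assumes propfile is given with forward slashes (the regexes split on '/'), so agg.root works out to ${output}/wwwroot/marketing/templates/agg.js and agg.fileroot to agg.js. The rewritten agg.js.cfm would then contain just this line:

```cfm
<script src="/marketing/js/agg.js" type="text/javascript"></script>
```

The aggregated file itself is written to ${rootdir}/agg.js, which is where that URL needs to resolve.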

Edit: Do note that you'll need both the ant-contrib package (which supplies the propertyregex, for, and if tasks) and the YUI Compressor JARs to be installed into Ant for this to work.
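One way to make the ant-contrib tasks visible, assuming the ant-contrib JAR is already on Ant's classpath (in ANT_HOME/lib, say), is a taskdef near the top of the build file. This is a sketch of the usual ant-contrib setup rather than a copy of our build file:

```xml
<!-- loads the ant-contrib task definitions, including for, if, and propertyregex -->
<taskdef resource="net/sf/antcontrib/antlib.xml" />
```

The YUI Compressor JAR needs to be on Ant's classpath as well, since the java task above builds its classpath from ${java.class.path}.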

S3 is Sweet (One App Down)

This weekend I ported my big filesystem-based app to S3, and it went like a dream. It's an image-management application, with all the actual images stored on disk. In addition to the standard import/edit/delete, the app provides automatic on-the-fly thumbnail generation, along with primitive editing capabilities (crop, resize, rotate, etc.). With images on local disk, that's all really easy: read them in, do whatever, write them back out. I figured using S3 would make things both more cumbersome and less performant. Both suspicions turned out to be unwarranted.

Building on the 's3Url' UDF that I published last week, I whipped up a little CFC to manage file storage on S3 with a very simple API. It has s3Url, putFileOnS3, getFileFromS3, s3FileExists, and deleteS3File methods, which all do about what you'd expect. You can grab the code here: amazons3.cfc.txt (make sure you remove the ".txt" extension) or visit the project page. It uses the simple HTTP-based interface, so after the authentication is handled, it's all very simple and fast. I haven't looked at the SOAP interface - why bother complicating a simple task?

With that CFC (and an application-specific wrapper to take care of some path-related transforms), porting the whole app took about two hours. I also realized after I was mostly done that the CF image tools accept URLs as well as files, so I switched my image reads to just use URLs instead of pulling the file local and reading it from disk.

As for moving all the actual content, S3Sync was a champ, moving about 4.5GB of data from my Cari server to S3 in a few hours, including gracefully handling a couple errors raised by S3 (which a retry - performed automatically - solved), and a stop/restart in the middle. Total cost: about 65 cents.

Next is porting the blogs, including all the Picasa-based galleries. Unfortunately, that means writing PHP, but with how easy the CF stuff was, I don't think it'll be too much effort.

Dummy Queries in ColdFusion 8.0.1

Brian Rinaldi posted on his blog about dummy queries in CF 8.0.1, and it struck me as a weird solution. So here's a drop-in replacement that I think works in a more reasonable fashion and doesn't have any dependency on an existing DSN.

<cffunction name="dummyQuery2" access="public" output="false" returntype="query">
  <cfargument name="queryData" type="struct" required="true" />
  <cfset var i = 0 />
  <cfset var columnName = "" />
  <cfset var myQuery = queryNew(structKeyList(queryData)) />
  <cfset var queryLength = arrayLen(arguments.queryData[listFirst(structKeyList(arguments.queryData))]) />
  <cfloop from="1" to="#queryLength#" index="i">
    <cfset queryAddRow(myQuery) />
    <cfloop collection="#arguments.queryData#" item="columnName">
      <cfset querySetCell(myQuery, columnName, queryData[columnName][i]) />
    </cfloop>
  </cfloop>
  <cfquery dbtype="query" name="myQuery">
    select *
    from [myQuery]
  </cfquery>
  <cfreturn myQuery />
</cffunction>

As you can see, the structure is almost identical, but it doesn't use a database; it just builds the query in memory. The "no-op" QofQ at the end is to ensure there is actual query metadata, not just the raw records, which Brian listed as one of his prerequisites. If you don't care, it can be removed with no ill effects.

One interesting benefit of this approach is that the rows come out in the same order as they go in - with Brian's DB-based one, that's not guaranteed because there is no ORDER BY clause on the query. Running his example on my box (using MSSQL 2005), I got rows sorted by first name. With the in-memory building, the rows are explicitly kept in order throughout.
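Here's a small usage sketch, assuming dummyQuery2 is defined on the same page. The column names and values are invented for illustration. The input is a struct of arrays, one array per column, as the function body implies.

```cfm
<!--- one array per column; every array must be the same length, because
      dummyQuery2 takes the row count from whichever key comes first --->
<cfset data = structNew() />
<cfset data.firstName = listToArray("Zed,Amy,Bob") />
<cfset data.lastName = listToArray("Young,Adams,Baker") />

<cfset people = dummyQuery2(data) />

<!--- rows should come back in the order they went in: Zed, Amy, Bob --->
<cfoutput query="people">
  #people.currentRow#: #people.firstName# #people.lastName#<br />
</cfoutput>
```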

Amazon S3 URL Builder for ColdFusion

First task for my Amazon move is getting data assets (non-code-managed files) over to S3. I have a variety of types of data assets that need to move and have references updated, most of which require authentication. To make that easier, I wrote a little UDF to take care of building URLs with the authentication credentials included.

<cffunction name="s3Url" output="false" returntype="string">
  <cfargument name="awsKey" type="string" required="true" />
  <cfargument name="awsSecret" type="string" required="true" />
  <cfargument name="bucket" type="string" required="true" />
  <cfargument name="objectKey" type="string" required="true" />
  <cfargument name="requestType" type="string" default="vhost"
    hint="Must be one of 'regular', 'ssl', 'vhost', or 'cname'.  'Vhost' and 'cname' are only valid if your bucket name conforms to the S3 virtual host conventions, and cname requires a CNAME record configured in your DNS." />
  <cfargument name="timeout" type="numeric" default="900"
    hint="The number of seconds the URL is good for.  Defaults to 900 (15 minutes)." />
  <cfscript>
    var expires = "";
    var stringToSign = "";
    var algo = "HmacSHA1";
    var signingKey = "";
    var mac = "";
    var signature = "";
    var destUrl = "";

    expires = int(getTickCount() / 1000) + timeout;
    stringToSign = "GET" & chr(10)
      & chr(10)
      & chr(10)
      & expires & chr(10)
      & "/#bucket#/#objectKey#";
    signingKey = createObject("java", "javax.crypto.spec.SecretKeySpec").init(awsSecret.getBytes(), algo);
    mac = createObject("java", "javax.crypto.Mac").getInstance(algo);
    mac.init(signingKey);
    signature = toBase64(mac.doFinal(stringToSign.getBytes()));
    if (requestType EQ "ssl" OR requestType EQ "regular") {
      destUrl = "http" & iif(requestType EQ "ssl", de("s"), de("")) & "://s3.amazonaws.com/#bucket#/#objectKey#?AWSAccessKeyId=#awsKey#&Signature=#urlEncodedFormat(signature)#&Expires=#expires#";
    } else if (requestType EQ "cname") {
      destUrl = "http://#bucket#/#objectKey#?AWSAccessKeyId=#awsKey#&Signature=#urlEncodedFormat(signature)#&Expires=#expires#";
    } else { // vhost
      destUrl = "http://#bucket#.s3.amazonaws.com/#objectKey#?AWSAccessKeyId=#awsKey#&Signature=#urlEncodedFormat(signature)#&Expires=#expires#";
    }

    return destUrl;
  </cfscript>
</cffunction>

To use it, do something like this:

s3Url(aws_key, aws_secret, "s3.barneyb.com", "test.txt", 'cname');

That will generate a signed URL for the file "test.txt" in the "s3.barneyb.com" bucket, using a CNAME-style URL. You'd need my AWS key and secret for that exact call to work, and I'm not telling, so substitute your own values. For the fifth parameter (requestType), you can use regular (bucket name in the request path), vhost (bucket name in an S3 subdomain), cname (a vanity CNAME pointing at S3), or ssl (regular over HTTPS) to control the style of URL generated.
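As a quick way to compare the styles, the sketch below loops over three of the request types and prints the URL each one produces. The credentials and bucket name are placeholders, and the 60-second timeout is arbitrary; s3Url is assumed to be defined on the same page.

```cfm
<!--- placeholder credentials and bucket; substitute your own --->
<cfset aws_key = "YOUR_ACCESS_KEY" />
<cfset aws_secret = "YOUR_SECRET_KEY" />
<cfset bucket = "my-bucket" />

<!--- URL shapes, per the function body (query string abbreviated):
      regular: http://s3.amazonaws.com/my-bucket/test.txt?AWSAccessKeyId=...
      ssl:     https://s3.amazonaws.com/my-bucket/test.txt?AWSAccessKeyId=...
      vhost:   http://my-bucket.s3.amazonaws.com/test.txt?AWSAccessKeyId=... --->
<cfloop list="regular,ssl,vhost" index="style">
  <cfoutput>#style#: #htmlEditFormat(s3Url(aws_key, aws_secret, bucket, "test.txt", style, 60))#<br /></cfoutput>
</cfloop>
```

The cname style is left out because it only makes sense when the bucket name is also a hostname you control.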

Edit: here's a link to the project page.

New Cyclic Data Structures Utility

Back in October I posted a fledgling cycle-safe CFDUMP replacement. Today, I had need for that same anti-cycle processing in serializeJson, so I abstracted the processing out into a CFC that handles both breaking cycles (for serialization) and restoring them (for deserialization). By running a cyclic data structure through the breakCycles method, you can use CFDUMP, serializeJson, or whatever other context-free recursive algorithm you want on it without fear of infinite looping. If you later turn that data structure back into an in-memory structure, you can use the restoreCycles method to recreate the cyclic references that breakCycles removed.

You can download the CFC (as a text file) here: cyclicutils.cfc.txt.  If you have the example from the cycle-safe CFDUMP somewhere, save the CFC in the same directory and tack this code on to the end of the test case:

<cfset cu = createObject("component", "cyclicutils") />
<cfset b = cu.breakCycles(b) />
<cfdump var="#b#" label="b" />
<cfset b = cu.restoreCycles(b) />
<u:dump var="#b#" />

You'll see the cyclic structure dumped with the cycle-safe CFDUMP as before, then the cycles are broken and it's dumped with the standard CFDUMP, and then the cycles are restored and it's dumped with the cycle-safe CFDUMP again.

Otherwise, just create yourself a cyclic structure and pass it to breakCycles.
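A minimal sketch of that, assuming cyclicutils.cfc is in the same directory as the calling page. The variable and key names are made up; the only calls taken from the CFC are breakCycles and restoreCycles, used the same way as in the snippet above.

```cfm
<!--- two structs that point at each other --->
<cfset nodeA = structNew() />
<cfset nodeB = structNew() />
<cfset nodeA.name = "A" />
<cfset nodeB.name = "B" />
<cfset nodeA.other = nodeB />
<!--- structs are assigned by reference, so this closes the cycle --->
<cfset nodeB.other = nodeA />

<cfset cu = createObject("component", "cyclicutils") />
<cfset safe = cu.breakCycles(nodeA) />
<cfoutput>#htmlEditFormat(serializeJson(safe))#</cfoutput>

<!--- later, bring the cyclic references back --->
<cfset restored = cu.restoreCycles(safe) />
```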

Report/Query DSLs Update

I've posted a simple demo app for both DSLs that you can play with. It's included in the distribution as index.cfm, so you'll get it if you pull down the source from SVN. I've also created a readme.txt file in the distribution with the text of my intro blog post.

I also made a very minor (though backwards incompatible) update to my Query DSL implementation. There is a convertToSimpleCriterion method, designed to be overridden by subclasses if necessary. The initial implementation contained the logic for processing negated terms (those prefixed with '-' or '!'), and for splitting terms into predicate/value pairs. Since that's part of the DSL implementation's job, it needed to be done by the internal methods, so the behaviour would be inherited by any subclasses. I've changed the convertToSimpleCriterion method to be passed all three parts individually, so the parsing can live inside the core of the implementation.

Report/Query DSLs

I use a pair of stacked DSLs (Domain Specific Languages) for searching, reporting, and goal management in a couple applications. A discussion at work (with Joshua and Koen) provided the solution for the final piece I wanted to implement before releasing them. The first layer is for parsing query strings, and the latter for using query strings to build up report structures and such.

For the impatient, code is available at https://ssl.barneyb.com/svn/barneyb/reporting_dsl/trunk/. You want to read the docs in querydsl.cfc and reportdsl.cfc.

The Query DSL is quite simple:

  • spaces delimit terms
  • terms are ANDed together
  • terms can be quoted if they contain spaces
  • terms can contain a prefix (predicate) separated from the value by a colon
  • a leading dash (or exclamation point) can be used to negate a term
  • the OR keyword (caps mandatory) can be used to create alternatives
  • parentheses can be used to group terms

Here are some examples:

  • cat type:pet
  • cat -manx
  • cat OR dog
  • cat (-manx type:"house pet") OR dog

The last example is not possible with Google Search: grouping with parentheses is the only construct here that Google doesn't provide. That isn't to suggest that the DSL supports the full gamut of features Google supports.

The result of using the DSL parser is a graph of 'criterion' instances, such as 'andcriterion', 'orcriterion', or 'simplecriterion'. The second example above would return an 'andcriterion' instance with its left side pointing to a 'simplecriterion' instance for "cat", and its right side pointing to another (negated) 'simplecriterion' instance for "manx". More complex expressions result in more complex graphs, but the graph is always singly rooted.

The Report DSL is a bit more complex, and leverages the Query DSL. It is line-oriented, with a single-character command specifier at the beginning of each line. A document contains one or more "condition" lines (prefixed with a '?' or an '&'), each of which is composed of a value and a query string. A document may also have one or more "globals" lines (prefixed with a '+') that contain only a query string, which is added to each condition's query string (to reduce duplication). Here's a sample document:

+ tag:dining -tag:home
? McDonalds : tag:mcdonalds
? Jack in the Box : tag:"jack in the box"
? Cheesecake Factory : tag:"cheesecake factory"
& Fast Food : tag:"fast food"
& Total : *

There is no real difference between the '?' and '&' lines (referred to as conditions and aggregates respectively); the two variants are there to provide a built-in differentiator. Here I'm using them as their names imply: the conditions are raw conditions, and the aggregates aggregate the conditions together. If this report document were used to generate a chart, you might see the conditions as column series and the aggregates as line series.

The result of parsing a document is a structure with 'coreCriterion', 'conditions', and 'aggregates' keys, all of which are optional (a key may be missing if the document doesn't specify anything to fill it). The first holds a criterion graph (since it's just a query string). The other two are arrays of structs, where each struct contains 'value', 'query', and 'criterion' keys. The 'value' and 'query' keys are simply the two halves of the corresponding line of the document, and the 'criterion' key holds the parsed query string.
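To show how that structure might be consumed, here's a sketch that walks the conditions and aggregates. The report variable is a hand-built stand-in for the parser's result, not real parser output, and the 'criterion' value is a placeholder string where the real thing would be a criterion graph.

```cfm
<!--- stand-in for the parser's result, built by hand for illustration --->
<cfset cond = structNew() />
<cfset cond.value = "McDonalds" />
<cfset cond.query = "tag:mcdonalds" />
<cfset cond.criterion = "(criterion graph)" />
<cfset report = structNew() />
<cfset report.conditions = arrayNew(1) />
<cfset arrayAppend(report.conditions, cond) />

<!--- every top-level key is optional, so check before looping --->
<cfloop list="conditions,aggregates" index="section">
  <cfif structKeyExists(report, section)>
    <cfloop from="1" to="#arrayLen(report[section])#" index="i">
      <cfoutput>#section#: #report[section][i].value# = #report[section][i].query#<br /></cfoutput>
    </cfloop>
  </cfif>
</cfloop>
```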

So what is this good for? By itself, not much, but with a method for converting between a criterion graph and a SQL expression, the Query DSL will give you an easy way to build very granular search functionality on your site. Once you have that, leveraging the Report DSL allows creation of fairly complex reports with a minimum of fuss (convert each condition's criterion to SQL, run your query, and feed the result into a chart series). Of course, the output needn't be a chart, but that's a good example.

What's most important here is that both of these DSLs are for users to use, not coders. I.e. strings/documents are created by users to customize the behaviour of your application, allowing them to build very specific reports very easily. And the definitions for those reports are simple strings, which makes persisting them a breeze.

With a little more creativity, you can get other behaviour. I alluded to goal management up top. If you were to take the report document above, remove the aggregates (the '&' lines), and replace the values (restaurant names) with integers according to how desirable they are, you'd have a way to rank given dining adventures. Write yourself a method for converting the parsed document into a SQL CASE..END statement (leveraging your criterion-to-SQL method, of course), and you have a really easy way to assign a "points" value to the items you're querying. Wrap that with some sort of structure for managing points per time period, and you have a very flexible and easy-to-use goal tracking system.

Like so much else, I don't know how useful this is for others. The Query DSL probably is, but the Report DSL might not be. But it's made my life much easier, so it's been worth the effort I've put into it. Code is only available via Subversion (browser or SVN client) at the URL above. All paths are relative, so you can svn:externals (use a revision number!) it into your existing package-space and reference the CFCs natively. There are docs and more examples in the files as well.

The Custom Tag Body-Scope

I was working with FlexChart a little this evening and ran into an interesting situation with a potentially very useful solution. I don't claim to be the first to think of it, but it's the first time I've used/seen it.

I added a date preparation tag a while back, but I thought it'd be nicer to have a UDF that you could use instead, when you're building a descriptor inline. What I realized is that for very little effort, you can easily scope variables (UDFs or otherwise) to the body of a custom tag. Here's how it works:

<cfif thisTag.executionMode EQ "start">
  <cfset injectedPrepChartDate = false />
  <cfif NOT structKeyExists(caller, "prepChartDate")>
    <cfif NOT structKeyExists(variables, "prepChartDate")>
      <cfinclude template="udf_prepchartdate.cfm" />
    </cfif>
    <cfset caller["prepChartDate"] = prepChartDate />
    <cfset injectedPrepChartDate = true />
  </cfif>
<cfelse> <!--- end --->
  <cfif injectedPrepChartDate>
    <cfset structDelete(caller, "prepChartDate") />
  </cfif>
</cfif>

First, in the start tag, I (conditionally) add a prepChartDate UDF to the caller scope, and set a variable for whether I did it or not. Then, in the end tag, I delete the UDF if I'd added it to the caller. The inner structKeyExists(variables, ...) check around the CFINCLUDE is only needed because I'm using a UDF; it's a simple don't-double-define-a-function check. In this case, I'm opting to only expose the UDF if it won't conflict with what is already there in the caller scope. It'd be just as reasonable to shadow the pre-existing variable and restore it after the body. This latter approach is basically how Fusebox implements DO parameters, if you've ever used them, though its mechanism is both far more elaborate and far more flexible.
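For comparison, a sketch of the shadow-and-restore alternative might look like this. It isn't what xmlchart.cfm does; the hadPrepChartDate and savedPrepChartDate variable names are mine, and it assumes the same udf_prepchartdate.cfm include as above.

```cfm
<cfif thisTag.executionMode EQ "start">
  <cfif NOT structKeyExists(variables, "prepChartDate")>
    <cfinclude template="udf_prepchartdate.cfm" />
  </cfif>
  <!--- remember whatever the caller already had under this name --->
  <cfset hadPrepChartDate = structKeyExists(caller, "prepChartDate") />
  <cfif hadPrepChartDate>
    <cfset savedPrepChartDate = caller["prepChartDate"] />
  </cfif>
  <!--- shadow it with the tag's UDF for the duration of the body --->
  <cfset caller["prepChartDate"] = prepChartDate />
<cfelse> <!--- end --->
  <cfif hadPrepChartDate>
    <!--- put the caller's original value back --->
    <cfset caller["prepChartDate"] = savedPrepChartDate />
  <cfelse>
    <cfset structDelete(caller, "prepChartDate") />
  </cfif>
</cfif>
```

As with the original, the cleanup only runs if the tag is invoked with an end tag.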

I've obviously removed everything else from the tag definition. If you want to see the full implementation of xmlchart.cfm (where I used it), you can see it in Subversion (look near the bottom), and/or see it in action on the demo.

New FlexChart Demo

I've updated the FlexChart demo to include display of the descriptor XML that is loaded into the chart, as well as providing a way to edit the XML inline and load your modified XML into the chart client-side. In addition to being far easier to experiment with, it also showcases the client-side redrawing of the chart and gives a hint of how powerful the engine can be within a JS UI.

Doing it this way means I lost the demo of having the chart request a new descriptor from the server side on its own, but the client-side descriptor injection is a "neater" capability, I think. I haven't posted an updated zip of the source (including the demo app), but it's available from Subversion at the URL listed at the bottom of the demo.

A Very Wonky Request/CFTHREAD Bug

Found some really interesting behaviour with CFTHREAD over the past couple days of testing.  Put this code into three browser tabs and run them concurrently (so you get three simultaneous requests which you can view the output of).  And ensure you have the CF Server Monitor open in a different window with monitoring enabled and the Active ColdFusion Threads report showing.

<cfoutput>
<cfapplication name="#createUUID()#" />
<cfset request.threadName = createUUID() />

<h1>Request: #timeFormat(now(), "HH:mm:ss")#</h1>
<cfflush />
<cfdump var="#cfthread#" label="cfthread is empty" />

<cfthread action="run"
  name="#request.threadName#">
  <cfthread action="sleep" duration="10000" />
</cfthread>

<cfdump var="#cfthread#" label="one thread running" />
<h1>Thread: #timeFormat(cfthread[request.threadName].startTime, "HH:mm:ss")#</h1>
<cfflush />

<cfif structKeyExists(url, "join")>
  <cfthread action="join" name="#request.threadName#" />
</cfif>

<h1>Complete: #timeFormat(now(), "HH:mm:ss")#</h1>
<cfdump var="#cfthread#" label="thread shows completed" />
</cfoutput>

What happens?  As expected, all three requests return immediately, each having spawned a thread.  The server monitor shows that the threads all go away after ten seconds.  Most importantly, all three requests started, launched their thread, and completed at the same time (+/- a few tens of milliseconds).

Now go to the first tab and add "?join" to the URL, which will trigger the joining of the thread back to the request thread. Refresh all three tabs again. This time, the tabs without the join command behave exactly the same, but the tab with the join waits for the thread to complete before returning, as evidenced by the "Complete" timestamp. This is all correct behaviour, so let's get down to it.

Add "?join" to the second tab (so now two tabs will join), and refresh all three tabs a third time.  Notice that the second tab that joins doesn't even start until the first joining request completes.  The non-joining request completes immediately, as always, initiating at the same time as the first joining request.

I don't even know how to characterize this issue.  It's certainly a bug, but I'm not sure what.  Even more baffling, I can't even come up with a theoretical scenario that would cause these symptoms.  Anyone?