<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>BarneyBlog &#187; development</title>
	<atom:link href="http://www.barneyb.com/barneyblog/category/software-development/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.barneyb.com/barneyblog</link>
	<description>Thoughts, rants, and even some code from the mind of Barney Boisvert.</description>
	<lastBuildDate>Thu, 02 Sep 2010 22:10:49 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Groovy Gravity Processing</title>
		<link>http://www.barneyb.com/barneyblog/2010/05/28/groovy-gravity-processing/</link>
		<comments>http://www.barneyb.com/barneyblog/2010/05/28/groovy-gravity-processing/#comments</comments>
		<pubDate>Fri, 28 May 2010 21:42:29 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[groovy]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=1534</guid>
		<description><![CDATA[Joshua (a coworker) and I have been talking about gravity simulation for a while, and this week I threw together a very simple model to do exactly that.  This grew out of a game called Flotilla that he came across somewhere and has been working with the developer to add a network multiplayer mode.  Flotilla, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.joshuafrankamp.com/">Joshua</a> (a coworker) and I have been talking about gravity simulation for a while, and this week I threw together a very simple model to do exactly that.  This grew out of a game called <a href="http://www.blendogames.com/flotilla/">Flotilla</a> that he came across somewhere; he has been working with the developer to add a network multiplayer mode.  Flotilla, however, doesn't have a concept of gravity, and doesn't really have an explicit concept of mass either, except the implicit stuff in its acceleration constraints.</p>
<p>Gravity, however, is kind of nasty to work with (from a computational perspective), particularly in a game situation (where you want realism, but not so real the game play sucks).  Fortunately, it's also exactly the kind of thing that I enjoy wasting my free time playing with.  So I threw together a simple model in Groovy, including a little type system and DSL to help keep my numbers straight.</p>
<p>So now when I divide a Force by a Mass, I'll get an Acceleration back.  Or if I multiply an Acceleration by a Time, I'll get a Velocity.  This was a huge win.  In the world of paper calculations, you don't do just arithmetic, you do unit algebra as well, which helps you ensure your answer makes sense (e.g., if you solve for velocity and get units of "meters per kilogram second", you screwed up somewhere).  But with a computer you don't have units, so I employed a simple type system to afford the same check I'd usually get from unit algebra.  And Groovy makes it remarkably easy to create both an expressive type system and a simple syntax (via operator overloading).</p>
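<p>To give a rough idea of how a type system can stand in for unit algebra, here's a minimal sketch.  It's in Python rather than Groovy, and the class and field names are hypothetical (they are not taken from the project in SVN); the only point is that operator overloading lets the types carry the units.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mass:
    kg: float


@dataclass(frozen=True)
class Time:
    s: float


@dataclass(frozen=True)
class Velocity:
    m_per_s: float


@dataclass(frozen=True)
class Acceleration:
    m_per_s2: float

    def __mul__(self, other):
        # Acceleration * Time yields a Velocity; anything else is rejected.
        if isinstance(other, Time):
            return Velocity(self.m_per_s2 * other.s)
        return NotImplemented


@dataclass(frozen=True)
class Force:
    newtons: float

    def __truediv__(self, other):
        # Force / Mass yields an Acceleration (a = F / m).
        if isinstance(other, Mass):
            return Acceleration(self.newtons / other.kg)
        return NotImplemented


accel = Force(10.0) / Mass(2.0)   # an Acceleration of 5.0 m/s^2
speed = accel * Time(3.0)         # a Velocity of 15.0 m/s
```

<p>Dividing a Force by a Time, on the other hand, raises a TypeError, which is the same sanity check unit algebra gives you on paper.</p>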
<p>Then I hacked up Processing ever so slightly so I could implement a PApplet in Groovy and avoid the compilation step, and built a simple viewer for the model.  Emphasis on "simple".  Here's a capture with three bodies:</p>
<p><a href="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/gravity-1.png"><img class="aligncenter size-full wp-image-1535" title="gravity-1" src="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/gravity-1.png" alt="" width="648" height="509" /></a></p>
<p>The bodies' sizes represent their relative mass, the green lines represent gravitation between them, the red lines are the velocity vector (per body), and the pink lines are acceleration (also per body).  Here's a few seconds later in the same simulation:</p>
<p><a href="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/gravity-2.png"><img class="aligncenter size-full wp-image-1536" title="gravity-2" src="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/gravity-2.png" alt="" width="648" height="509" /></a></p>
<p>Now, because the bodies are far closer, the forces between them are much greater, which is going to result in a couple fairly impressive slingshots in a moment.  They've also picked up a fair amount of speed, though their active acceleration has surpassed their velocities (i.e., slingshots are imminent).  Here's after the slingshots have happened:</p>
<p><a href="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/gravity-3.png"><img class="aligncenter size-full wp-image-1537" title="gravity-3" src="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/gravity-3.png" alt="" width="650" height="510" /></a></p>
<p>After the two larger bodies slingshotted around each other, the two smaller bodies did the same thing, resulting in a <em>rapidly</em> expanding model.  As you can see the forces are falling away, though the smallest and largest bodies will soon meet up and cross paths.  The velocity of the little one is high enough that the big one can't slow it down enough to do more than deflect it upward a bit.</p>
<p><a href="http://processing.org/">Processing</a> again completely delivered on its promise of providing a remarkably simple way to sketch out visual stuff.  Simple event handling, simple setup, a simple draw loop, and solid primitives.  I need to do a little mucking about, but adding 3D support (which the model handles, but the viewer doesn't) will be similarly trivial: flip the "use 3D" bit, and add 'z' coordinates from the model in the various calls.  Very nice, especially when coupled with the <a href="http://groovy.codehaus.org/">Groovy</a> syntax and capabilities.</p>
<p>An interesting project, though we'll see how generally useful.  Code, of course, is <a href="http://subversion.assembla.com/svn/gravity/trunk/">available from SVN</a>.  It's a single Eclipse project, including Groovy, Processing, the hacked PApplet (from Processing), the model and the viewer applet.  Check it out and run driver.groovy to see it in action.  While it's running, SPACE will pause/unpause the action, though there is currently no way to rewind/reset the simulation.  If you don't want to check out the project, you can <a href="http://www.assembla.com/code/gravity/subversion/nodes">browse it directly</a>.  The interesting bits are in the /src/com/barneyb/gravity/ package.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2010/05/28/groovy-gravity-processing/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Processing and Galcon</title>
		<link>http://www.barneyb.com/barneyblog/2010/05/14/processing-and-galcon/</link>
		<comments>http://www.barneyb.com/barneyblog/2010/05/14/processing-and-galcon/#comments</comments>
		<pubDate>Fri, 14 May 2010 23:26:14 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[development]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=1502</guid>
		<description><![CDATA[I did a little experiment last night using Processing, which is a Java-based visual programming environment that I've repeatedly run into in various different contexts, but had never really done anything with.  I've become completely addicted to Galcon Lite on my iPhone, and figured it was a "sample" to build with Processing.  Note that I [...]]]></description>
			<content:encoded><![CDATA[<p>I did a little experiment last night using <a href="http://processing.org/">Processing</a>, which is a Java-based visual programming environment that I've repeatedly run into in various different contexts, but had never really done anything with.  I've become completely addicted to <a href="http://www.galcon.com/">Galcon</a> Lite on my iPhone, and figured it was a good "sample" to build with Processing.  Note that I was <strong>not</strong> attempting to replicate the look and feel of the game, just the mechanics.  If you don't know Galcon, here's a screenshot:</p>
<p><a href="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/galcon.jpg"><img class="aligncenter size-full wp-image-1503" title="galcon" src="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/galcon.jpg" alt="" width="320" height="480" /></a></p>
<p>The basic idea is that you take over planets (which produce more ships), and try to exterminate your opponent (which I'm moments away from doing).  The green planets are mine, the grey planets are neutral, and the orange is my opponent.  Here's a similar just-about-to-win screencapture from my version of the game:</p>
<p><a href="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/galconvict.png"><img class="aligncenter size-full wp-image-1504" title="galconvict" src="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/05/galconvict.png" alt="" width="650" height="510" /></a></p>
<p>Rather than have masses of ship icons, I grouped them into fleets (the hollow circles) with the number of ships indicated.  Notice that I also don't make my fleets go around other planets, as evidenced by the "26" fleet going through the "19" planet right in the middle.  Gameplay, however, is basically identical to the original.</p>
<p><span style="text-decoration: line-through;">I made the code available on GitHub: <a href="http://github.com/barneyb/GalConvict">http://github.com/barneyb/GalConvict</a>. It's hardly the Mona Lisa.  I made exactly zero effort to architect, refactor, etc.  It was just a proof of concept for playing with Processing.</span> [2010-05-17: This has been retracted - <a href="http://www.barneyb.com/barneyblog/2010/05/17/galconvict/">read why</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2010/05/14/processing-and-galcon/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Polyglot Programming at cf.objective()</title>
		<link>http://www.barneyb.com/barneyblog/2010/04/23/polyglot-programming-at-cf-objective/</link>
		<comments>http://www.barneyb.com/barneyblog/2010/04/23/polyglot-programming-at-cf-objective/#comments</comments>
		<pubDate>Fri, 23 Apr 2010 19:00:10 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[cfml]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[groovy]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=1484</guid>
		<description><![CDATA[This afternoon I presented on Polyglot Programming at cf.objective() 2010.  Unlike most presentations I give, this one has almost no code, so the slidedeck (as a PDF) is the whole shebang.  The in-deck content is admittedly light; really just an outline to follow along as I talked.  The short version of the verbal part is:
Using [...]]]></description>
			<content:encoded><![CDATA[<p>This afternoon I presented on Polyglot Programming at cf.objective() 2010.  Unlike most presentations I give, this one has almost no code, so the <a href="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/04/Polyglot_Programming.pdf">slidedeck (as a PDF)</a> is the whole shebang.  The in-deck content is admittedly light; really just an outline to follow along as I talked.  The short version of the verbal part is:</p>
<blockquote><p>Using multiple languages has a bit of a learning curve but it pays off, and more quickly than you think.  Language selection and design is a vital aspect to being a successful developer, both for individual projects and as part of your continuing career.</p></blockquote>
<p>I'll probably give the presentation again on CFMeetup at some point this year, and maybe at a user group or two, so if you missed it, all is not lost.  Unfortunately (or fortunately), cf.objective() is so content-rich that it's hard to get to every session you want, so a second chance is probably worth having.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2010/04/23/polyglot-programming-at-cf-objective/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Domain Model Integrity</title>
		<link>http://www.barneyb.com/barneyblog/2010/04/14/domain-model-integrity/</link>
		<comments>http://www.barneyb.com/barneyblog/2010/04/14/domain-model-integrity/#comments</comments>
		<pubDate>Wed, 14 Apr 2010 16:35:23 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[coldfusion]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[orm]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=1478</guid>
		<description><![CDATA[Unlike my last several posts, this one isn't ORM related.  At least not directly.  If you're using ORM, you necessarily care about your domain model's integrity, as it's a prerequisite for ORM doing its job, but it has nothing to do with ORM specifically.  The point of a domain model is to be a representation [...]]]></description>
			<content:encoded><![CDATA[<p>Unlike my last several posts, this one isn't ORM related.  At least not directly.  If you're using ORM, you necessarily care about your domain model's integrity, as it's a prerequisite for ORM doing its job, but it has nothing to do with ORM specifically.  The point of a domain model is to be a representation of your business rules and logic, and that means it needs to be internally consistent.</p>
<p>If you're building a SQL-heavy, procedural application, the database is probably the only place your domain model is represented.  But if you're building an object oriented application, your domain model will also be represented in memory as object graphs.  In almost all cases, a given object graph is only a small slice of your entire domain model, but it is a representation and must be kept consistent.</p>
<p>Here's an example of a very simple domain model consisting of Person and Pet classes, where a Pet has an owner (a Person), and a Person has a collection of Pets:</p>
<pre>component Person {
  property name="name" type="string";
  property name="pets" type="array[Pet]";
}

component Pet {
  property name="species" type="string";
  property name="owner" type="Person";
}</pre>
<p>Just to reiterate, these are NOT persistent types.  They're simple types for in-memory use only.</p>
<p>So what semantics does this model imply?  Or to rephrase, what invariants does this model carry?  The most important semantic is that the relationship between pets and their owners is expressed from both sides (both classes).  More explicitly stated, the domain model is structured such that if you have a Person you can get their Pets, and if you have a Pet, you can get its owner (a Person).  The implications of this are expressed in these two invariants:</p>
<pre>assert pet.owner == null || pet.owner.pets.contains(pet)
assert person.pets.every { it.owner == person }</pre>
<p>The first one states that if a Pet has an owner, that owner's "pets" collection must contain it.  The second one states that every Pet in a Person's "pets" must have that Person set as its owner.  I'm trying to be really deliberate in spelling this out, because it's <em>really</em> important.</p>
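<p>For anyone who prefers to see the two invariants as something you can actually run, here's a small sketch in Python.  The is_consistent helper and the plain Person and Pet classes are illustrative stand-ins, not part of any framework:</p>

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.pets = []


class Pet:
    def __init__(self, species, owner=None):
        self.species = species
        self.owner = owner


def is_consistent(people, pets):
    # Invariant 1: a pet with an owner appears in that owner's "pets".
    owners_ok = all(
        pet.owner is None or any(p is pet for p in pet.owner.pets)
        for pet in pets
    )
    # Invariant 2: every pet in a person's "pets" names that person as owner.
    pets_ok = all(
        pet.owner is person for person in people for pet in person.pets
    )
    return owners_ok and pets_ok


alice = Person("Alice")
rex = Pet("dog", owner=alice)          # only one side of the relationship is set
half_set = is_consistent([alice], [rex])
alice.pets.append(rex)                 # now both sides agree
fully_set = is_consistent([alice], [rex])
```

<p>With only the owner reference set, the check fails; once the Pet is also in the owner's collection, it passes.</p>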
<p>Just for a moment, let's take a detour to the relational database world.  If we were to express this domain model in the database we'd have a Person table and a Pet table, and the Pet table would have a foreign key (likely named 'owner_id') that references the Person table's primary key.  SQL allows us to traverse the relationship expressed by that foreign key in either direction, so both relationships (Person-&gt;Pet and Pet-&gt;Person) are expressed in a single place (the foreign key column).  Both directions are represented together.  A foreign key constraint (which virtually every RDBMS supports) on the column does nothing more than instruct the database to enforce these invariants.  This is all second nature, and we don't even think about it when we use a database to represent our domain model.</p>
<p>Now back to the in-memory representation.  We still need to enforce these invariants, but in memory we have to deal with references (pointers), and references only point in one direction.  That's why we have to have both the 'pets' property (a Person's references to Pets) and the 'owner' property (a Pet's reference to a Person), but in the database we only need one foreign key column (Pet.owner_id).  The relationship between Person and Pet objects is actually expressed in a <strong>pair</strong> of references.</p>
<p>The problem with this arrangement is that you, in effect, double represent your relationships.  Both invariants must remain true, and since each invariant is represented by its own reference in the model, you have to synchronize changes to those references.  When you set the owner of a Pet, you must also add that Pet to the owner's "pets" collection.  When you remove a Pet from a Person's "pets" collection, you must also remove the Pet's owner reference.  If you don't keep these in sync, one of your invariants will be false, and that means your domain model is in an invalid/inconsistent state.</p>
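<p>One common way to keep the two references in sync is to route every change through a single method that updates both sides.  Here's a minimal Python sketch of that idea; the method names (set_owner, add_pet, remove_pet) are assumptions made for illustration, and this is just one of several reasonable designs:</p>

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.pets = []

    def add_pet(self, pet):
        # Delegate to the Pet so there is exactly one place that mutates both sides.
        pet.set_owner(self)

    def remove_pet(self, pet):
        if pet.owner is self:
            pet.set_owner(None)


class Pet:
    def __init__(self, species):
        self.species = species
        self.owner = None

    def set_owner(self, new_owner):
        # Detach from the previous owner so both references change together.
        if self.owner is not None:
            self.owner.pets.remove(self)
        self.owner = new_owner
        if new_owner is not None:
            new_owner.pets.append(self)
```

<p>Because callers never touch "pets" or "owner" directly, there's no window where one side has changed and the other hasn't (threading aside, which this sketch ignores).</p>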
<p>When your domain model is in an invalid state, your application falls apart.  Every assumption you make in your application is suddenly unreliable, because they're predicated on your business rules, and your business rules are expressed through your domain model.  An invalid domain model means your business rules were violated, and anything you do from this point forward will be suspect.  I'm going to say it again: this is <strong>really</strong> important.  If your domain model is in an invalid state, your application has failed.  Period.  End of story.  Your only recourse is to revert it back to its last known consistent state and throw away all pending operations.</p>
<p>What about relationships that are not bi-directional?  For example, perhaps your model has PrivateEye and Subject types.  Clearly the PrivateEye needs to know about his Subject, but it'd be kind of silly if the Subject knew about the PrivateEye.  In this case the relationship only moves one way, so there is only one reference (from PrivateEye-&gt;Subject), and <em>there is no invariant</em>.  When we put this in the database, however, we have exactly the same structure as the bi-directional Person&lt;-&gt;Pet relationship: a foreign key that can be traversed in two directions.  With the database representation of the model you <strong>can't</strong> express the concept of a one-directional relationship.  This is a powerful differentiator for in-memory models, since it gives you much finer control over the semantics of your model than a database could ever provide.</p>
<p>So where does this relate to ORM?  Just like everything else in your application, Hibernate depends on your invariants being true in order to persist your model to the database.  If they're not true, Hibernate can't hope to do its job correctly.  A huge number of "problems" that people starting out with Hibernate face have nothing to do with Hibernate itself, but rather are caused by an invalid in-memory domain model.  Coming from the world of procedural, SQL-based persistence, you don't necessarily have to worry about an in-memory domain model's integrity, which means that you can write what amount to buggy applications where the bugs never manifest themselves.</p>
<p>Bottom line is that if you're using an in-memory domain model, you simply <strong>must</strong> ensure the invariants of that model remain true.  More specifically, you <strong>must</strong> set both sides of every bi-directional relationship.  If you don't, you're just asking for punishment, both from your software tooling and from users/clients of your application.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2010/04/14/domain-model-integrity/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Simple CSS Tabs</title>
		<link>http://www.barneyb.com/barneyblog/2010/03/30/simple-css-tabs/</link>
		<comments>http://www.barneyb.com/barneyblog/2010/03/30/simple-css-tabs/#comments</comments>
		<pubDate>Tue, 30 Mar 2010 18:28:35 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[css]]></category>
		<category><![CDATA[development]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=1397</guid>
		<description><![CDATA[I use tabs for navigation a lot.  Not for in-page DOM swapping, but for expressing a list of available pages along with indicating which one you're on.  Pretty much every tab "system" is centered around client-side manipulation, rather than just presenting server-generated markup.  And the few counter examples don't do it in an encapsulated way, [...]]]></description>
			<content:encoded><![CDATA[<p>I use tabs for navigation a lot.  Not for in-page DOM swapping, but for expressing a list of available pages along with indicating which one you're on.  Pretty much every tab "system" is centered around client-side manipulation, rather than just presenting server-generated markup.  And the few counter examples don't do it in an encapsulated way; they use a body selector, tab-id-specific styling, or whatever.  What I really wanted was to simply emit a UL with some LIs inside (one of them with class="active") and be done.  No muss, no fuss.</p>
<p>After faking it with ad hoc CSS in a number of apps I decided it was time to actually make a concerted effort to build a reusable mechanism for doing this.  So I did.</p>
<p><img class="aligncenter size-full wp-image-1399" title="css_tabs" src="http://www.barneyb.com/barneyblog/wp-content/uploads/2010/03/css_tabs.png" alt="" width="634" height="45" /></p>
<p>You can get the code as well as a demo and docs at <a href="http://www.barneyb.com/r/css_tabs.cfm">http://www.barneyb.com/r/css_tabs.cfm</a>.  Included is a form for customizing the colors and sizing of the tabs (since some of the values are used in multiple directives), and the CSS is emitted at the bottom (as well as into the document itself for styling the demos).</p>
<p>Note that this is <em>not</em> designed to be the be-all and end-all CSS tab solution.  It is designed to be a really lightweight and easy to use CSS tab solution.  Emphasis on simplicity.  De-emphasis on edge cases, backwards compatibility, and bells and whistles.  Firefox 3+, IE8+, and Chrome all do fine.  I didn't test others.  IE7 and below fail spectacularly (though the navigation remains totally usable).  And mind your !DOCTYPE.</p>
<p>Finally, since I'm deliberately only styling the UL and LI, you can put whatever you want inside the LIs.  I'm already using this layout on a number of apps and while most have simple As inside the LIs, some of them are using icons and/or icons and text.</p>
<p>I don't know if that's useful to anyone else, but since it's been so helpful for me, I figured I'd share.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2010/03/30/simple-css-tabs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scaling Averages By Count</title>
		<link>http://www.barneyb.com/barneyblog/2010/03/02/scaling-averages-by-count/</link>
		<comments>http://www.barneyb.com/barneyblog/2010/03/02/scaling-averages-by-count/#comments</comments>
		<pubDate>Wed, 03 Mar 2010 03:49:30 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[potd]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=1290</guid>
		<description><![CDATA[One of the problems with statistics is that they work really well when you have perfect data (and therefore don't really need to do statistics), but start falling apart when the real world rears its ugly head and gives you data that isn't all smooth.  Consider a very specific case: you have items that people [...]]]></description>
			<content:encoded><![CDATA[<p>One of the problems with statistics is that they work really well when you have perfect data (and therefore don't really need to do statistics), but start falling apart when the real world rears its ugly head and gives you data that isn't all smooth.  Consider a very specific case: you have items that people can rate and then you want to pull out the "favorite" items based on those ratings.  As a more concrete example, say you're Netflix and based on a person's movie ratings (from 1-5 stars), you want to identify their favorite actors (piggybacking the assumption that movies they like probably have actors they like).</p>
<p>This is a simple answer to derive: just average the ratings of every movie the actor was in, and whichever actors have the highest average are the favorites.  Here it is expressed in SQL:</p>
<pre>select actor.name, avg(rating.stars) as avgRating
from actor
  inner join movie_actor on movie_actor.actorId = actor.id
  inner join movie on movie_actor.movieId = movie.id
  inner join rating on movie.id = rating.movieId
where rating.subscriberId = ? -- the ID of the subscriber whose favorite actors you want
group by actor.name
order by avgRating desc
</pre>
<p>The problem is that &#8211; as an example &#8211; Tom Hanks was in both Sleepless in Seattle and Saving Private Ryan.  Clearly those two movies appeal to different audiences, and it seems very reasonable that someone who saw both would like one far more than the other, regardless of whether or not they like Tom Hanks.  The next problem is if they've only seen one of those movies, the ratings are going to paint an unfair picture of Tom Hanks' appeal.  So how can we solve this?</p>
<p>The short answer is that we can't.  In order to solve it, we'd have to synthesize the missing data points, which isn't possible for obvious reasons.  However, we can make a guess based on other datapoints that we do have.  In particular, we know the average rating for all movies for a user, so we can bias "small" actor samples towards that overall average.  This will help mitigate the dramatic effect of outliers in small sample sizes when there aren't enough other datapoints to mitigate them.</p>
<p>In other words, instead of this: <img src='http://s.wordpress.com/latex.php?latex=%5Coverline%7Br%7D_%7Bactor%7D%5C%20%3D%5C%20avg%28rating_%7Bmovie_%7Bactor%7D%7D%29&#038;bg=T&#038;fg=000000&#038;s=1' alt='\overline{r}_{actor}\ =\ avg(rating_{movie_{actor}})' title='\overline{r}_{actor}\ =\ avg(rating_{movie_{actor}})' class='latex block' /></p>
<p>we can do something like this: <img src='http://s.wordpress.com/latex.php?latex=n%20%3D%20count%28rating_%7Bmovie_%7Bactor%7D%7D%29&#038;bg=T&#038;fg=000000&#038;s=1' alt='n = count(rating_{movie_{actor}})' title='n = count(rating_{movie_{actor}})' class='latex block' /> <img src='http://s.wordpress.com/latex.php?latex=%5Coverline%7Br%5E%5Cprime%7D_%7Bactor%7D%5C%20%3D%5C%20%5Cbar%7Br%7D_%7Bactor%7D%5C%20-%5C%20%5Cfrac%7B%28%5Cbar%7Br%7D_%7Bactor%7D%5C%20-%5C%20%5Cbar%7Br%7D_%7Boverall%7D%29%7D%7B1.15%5En%7D&#038;bg=T&#038;fg=000000&#038;s=2' alt='\overline{r^\prime}_{actor}\ =\ \bar{r}_{actor}\ -\ \frac{(\bar{r}_{actor}\ -\ \bar{r}_{overall})}{1.15^n}' title='\overline{r^\prime}_{actor}\ =\ \bar{r}_{actor}\ -\ \frac{(\bar{r}_{actor}\ -\ \bar{r}_{overall})}{1.15^n}' class='latex block' /></p>
<p>This simply takes the normal average from above, and "scoots" it towards the overall average based on the sample count.  The denominator is a constant picked by me (more later) raised to the power equal to the number of samples we have.  This way as the number of samples goes up, the magnitude of the correction falls rapidly.  Here's a chart illustrating this (the x axis is a log scale):</p>
<p style="text-align: center;"><img class="aligncenter" title="Correction By Sample Count" src="http://chart.apis.google.com/chart?cht=lc&amp;chs=500x275&amp;chd=t:.8696,.7561,.5718,.3269,.1069,.0114,.000013&amp;chds=0,1&amp;chxt=x,y,x&amp;chxr=1,0,1,0.1&amp;chxl=0:|1|2|4|8|16|32|64|2:|samples&amp;chxp=2,50&amp;chtt=Correction+By+Sample+Count+(1.15+factor)&amp;chg=16.66,20,1,4" alt="" width="500" height="275" /></p>
<p>With only one sample, the per-actor average will be scooted 87% of the way towards the overall average.  With four samples the correction will be only 57%, and by the time you get 32 samples there will be only a 1% shift.  Note that those percentages are of the distance to the overall average, not any absolute value change.  So if a one-sample actor happens to be only 0.5 stars away from the overall average, the net correction will be 0.435.  However, if a different one-sample actor is 1.5 stars away from the overall average, the net correction will be 1.305.</p>
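<p>The same formula as a few lines of Python, in case that's easier to read than the notation.  The function and parameter names are just for illustration; 1.15 is the factor discussed here, exposed as a default so it can be tuned:</p>

```python
def correction_fraction(n, factor=1.15):
    # Share of the gap to the overall average that gets removed;
    # it shrinks geometrically as the sample count n grows.
    return 1.0 / factor ** n


def corrected_average(actor_avg, overall_avg, n, factor=1.15):
    # r' = r_actor - (r_actor - r_overall) / factor^n
    gap = actor_avg - overall_avg
    return actor_avg - gap * correction_fraction(n, factor)


# One sample, half a star above the overall average:
# most of that half star is taken back.
one_sample = corrected_average(3.5, 3.0, 1)

# Thirty-two samples with the same raw average: barely moves.
many_samples = corrected_average(3.5, 3.0, 32)
```

<p>An actor below the overall average gets pulled up by the same mechanism, since the gap is simply negative.</p>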
<p>Of course, I'm not Netflix, so my data was from PotD, but the concept is identical.  The "1.15" factor was derived based on testing on the PotD dataset, and demonstrated an appropriate falloff as the sample size increased.  Here's a sample of the data, showing both uncorrected and corrected average ratings, along with pre- and post-correction rankings:</p>
<table>
<thead>
<tr>
<th>Model</th>
<th>Samples</th>
<th>Average</th>
<th>Corr. Average</th>
<th>Rank</th>
<th>Corr. Rank</th>
</tr>
</thead>
<tbody>
<tr>
<td>#566</td>
<td>22</td>
<td>4.1818</td>
<td>4.1310</td>
<td>46</td>
<td>1</td>
</tr>
<tr>
<td>#375</td>
<td>12</td>
<td>4.1667</td>
<td>3.9640</td>
<td>47</td>
<td>2</td>
</tr>
<tr>
<td>#404</td>
<td>13</td>
<td>4.0000</td>
<td>3.8509</td>
<td>81</td>
<td>3</td>
</tr>
<tr>
<td>#1044</td>
<td>7</td>
<td>4.2857</td>
<td>3.8334</td>
<td>44</td>
<td>4</td>
</tr>
<tr>
<td>#564</td>
<td>5</td>
<td>4.4000</td>
<td>3.7450</td>
<td>42</td>
<td>5</td>
</tr>
<tr>
<td>#33</td>
<td>32</td>
<td>3.7500</td>
<td>3.7424</td>
<td>176</td>
<td>6</td>
</tr>
<tr>
<td>#954</td>
<td>4</td>
<td>4.5000</td>
<td>3.6895</td>
<td>40</td>
<td>7</td>
</tr>
<tr>
<td>#733</td>
<td>4</td>
<td>4.5000</td>
<td>3.6895</td>
<td>39</td>
<td>8</td>
</tr>
<tr>
<td>#330</td>
<td>7</td>
<td>4.0000</td>
<td>3.6551</td>
<td>74</td>
<td>9</td>
</tr>
<tr>
<td>#293</td>
<td>5</td>
<td>4.2000</td>
<td>3.6444</td>
<td>45</td>
<td>10</td>
</tr>
</tbody>
</table>
<p>In particular, model #33 sees a huge jump upward because of the number of samples.  You can't see it here, but the top 37 models using the simple average are all models with a single sample (a 5-star rating), which is obviously not a real indicator.  Their corrected average is 3.3391, so not far off the leaderboard, but appreciably lower than those who have consistently received high ratings.</p>
<p>For different size sets (both overall, and expected number of ratings per actor/model) the factor will need to be adjusted.  It must remain strictly greater than one, and is theoretically unbounded on the other end but there is obviously a practical/reasonable limit.</p>
<p>Is this a good correction?  Hard to say.  It seems to work reasonably well with my PotD dataset (both as a whole, and segmented various ways), and it makes reasonable logical sense too.  The point really is that if you don't care about correctness, you can do some interesting fudging of your data to help it be useful in ways that it couldn't otherwise be.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2010/03/02/scaling-averages-by-count/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Front Controllers Should NOT Extend Application.cfc</title>
		<link>http://www.barneyb.com/barneyblog/2010/02/12/applicationcfc-extends-front-controller-is-evil/</link>
		<comments>http://www.barneyb.com/barneyblog/2010/02/12/applicationcfc-extends-front-controller-is-evil/#comments</comments>
		<pubDate>Fri, 12 Feb 2010 22:59:25 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[cfml]]></category>
		<category><![CDATA[development]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=1221</guid>
		<description><![CDATA[I've been playing with FW/1 a bit on a personal app, and it has proven incredibly frustrating due to multiple manifestations of a single problem: your Application.cfc HAS to extend the framework in order to use the framework.  My complaint really has nothing to do with FW/1 in particular, the exact same argument could be [...]]]></description>
			<content:encoded><![CDATA[<p>I've been playing with <a href="http://fw1.riaforge.org/">FW/1</a> a bit on a personal app, and it has proven incredibly frustrating due to multiple manifestations of a single problem: your Application.cfc HAS to extend the framework in order to use the framework.  My complaint really has nothing to do with FW/1 in particular, the exact same argument could be made against <a href="http://fusebox.org/">Fusebox</a>'s Application.cfc integration (but FB at least provides a "normal" way to use it).  And just to be clear, I'm also not railing against <a href="http://www.corfield.org/">Sean Corfield</a>, even though he happens to be the author of both FW/1 and Fusebox's Application.cfc integration.</p>
<p>The first problem is due to Adobe's seemingly mindless choice to require the use of Application.cfc for per-request settings (datasource, mappings, ORM config, etc.), rather than doing it the right way with tags (in the style of the CFAPPLICATION tag).  With Application.cfc being the only place you can define any of this, you cannot use Application.cfc for an individual frontend, since it has to be shared across ALL frontends.</p>
<p>Consider a prototypical blog.  It has a public side (where the public reads), and an admin side (where authors write).  Two separate front ends; one single application.  If your front controller is bound to Application.cfc, you're forced to either run two separate applications or a single dual-purpose frontend.  Either one is a mess, either reducing encapsulation or increasing duplication.  At the very least.</p>
<p>Now consider a different example: an app with a single frontend that also needs one, single, solitary, standalone page for something.  Maybe even just for a quick one-off test script.  You create 'test.cfm' in your directory (so it gets the proper Application.cfc context so you can do your ORM magic), and hit it in the browser.  Oops, your framework decided with its onRequest handling that it's going to just do its thing, completely ignoring your template.  Different manifestation, same problem, though this one can be mostly addressed by overriding onRequest with custom behaviour that conditionally invokes super.onRequest.</p>
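<p>The shape of that onRequest workaround might look something like the sketch below.  It's JavaScript rather than CFML, purely to show the idea; <code>Framework</code>, <code>runTemplate</code>, and the list of standalone pages are hypothetical stand-ins, not FW/1's actual API.</p>

```javascript
// Hypothetical stand-in for a framework that takes over every request.
class Framework {
  onRequest(targetPage) {
    return "framework handled " + targetPage;
  }
}

// The application still extends the framework, but lets standalone
// templates bypass the framework's request handling.
class Application extends Framework {
  constructor() {
    super();
    // Pages that should run as plain templates (an assumption for this sketch).
    this.standalonePages = ["/test.cfm"];
  }

  onRequest(targetPage) {
    if (this.standalonePages.includes(targetPage)) {
      return this.runTemplate(targetPage);
    }
    return super.onRequest(targetPage);
  }

  // Placeholder for "just include the requested template".
  runTemplate(targetPage) {
    return "template ran " + targetPage;
  }
}
```

<p>It works, but the application has to know which requests to wrestle back from the framework, which is why I say "mostly addressed".</p>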
<p>Like the majority of places where inheritance is used, the proper solution is composition.  Rather than having your Application.cfc extend the framework, let your application compose the framework into itself.  That way it happens on the application's terms, rather than the framework's terms.  I understand that just extending the framework is desirable for ease of initial setup, so I'm not saying you can't do that, just that you (as a framework author) should provide an alternative (like Fusebox's fusebox5.cfm).  Then I (as the application developer) can decide how the framework should be used.</p>
<p>Just to be clear, the problem with using inheritance where composition is the correct choice rests on the shoulders of both Adobe (for making request settings part of Application.cfc, rather than composed in with tags) and framework authors (for requiring Application.cfc to extend the framework).  Addressing either of these problems would handle the first manifestation (paragraph 3), but only the framework authors can deal with the second manifestation (paragraph 4).</p>
<p>Bottom line: don't wire yourself in so you can self-invoke; let me invoke you when and where I want you.</p>
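<p>Here's a rough sketch of what composing the framework in could look like.  Again it's JavaScript instead of CFML, and <code>FrontController</code>, <code>BlogApplication</code>, the path prefixes, and the datasource name are all made up for illustration; no real framework exposes exactly this.</p>

```javascript
// Hypothetical framework exposed as a plain object instead of a base class.
class FrontController {
  constructor(config) {
    this.config = config;
  }
  handle(targetPage) {
    return this.config.name + " handled " + targetPage;
  }
}

// The application owns the shared request settings and composes in one
// framework instance per frontend, invoking each only when it chooses to.
class BlogApplication {
  constructor() {
    this.datasource = "blog"; // shared settings live here, once
    this.frontends = {
      "/admin/": new FrontController({ name: "admin" }),
      "/": new FrontController({ name: "public" })
    };
  }

  onRequest(targetPage) {
    if (targetPage.endsWith("/test.cfm")) {
      return "template ran " + targetPage; // standalone page, no framework
    }
    // String keys keep insertion order, so "/admin/" is checked before "/".
    for (const prefix of Object.keys(this.frontends)) {
      if (targetPage.startsWith(prefix)) {
        return this.frontends[prefix].handle(targetPage);
      }
    }
    return "not found";
  }
}
```

<p>One application, two frontends, and a standalone page, all without the framework ever owning the request lifecycle.</p>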
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2010/02/12/applicationcfc-extends-front-controller-is-evil/feed/</wfw:commentRss>
		<slash:comments>34</slash:comments>
		</item>
		<item>
		<title>First Person Documentation</title>
		<link>http://www.barneyb.com/barneyblog/2009/06/02/first-person-documentation/</link>
		<comments>http://www.barneyb.com/barneyblog/2009/06/02/first-person-documentation/#comments</comments>
		<pubDate>Tue, 02 Jun 2009 21:49:56 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[personal]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=990</guid>
		<description><![CDATA[I'm not sure when I started, but I've documented things in the first person for quite a while.  Fusedocs promoted this format, and was probably a significant influence, though I recall doing it back in college as well.  It's clearly not new or uncommon, but I just had a gentleman email me about it (based [...]]]></description>
			<content:encoded><![CDATA[<p>I'm not sure when I started, but I've documented things in the first person for quite a while.  Fusedocs promoted this format, and was probably a significant influence, though I recall doing it back in college as well.  It's clearly not new or uncommon, but I just had a gentleman email me about it (based on finding my old <a href="http://www.barneyb.com/barneyblog/projects/combobox/">ComboBox</a> script embedded in an app he inherited), and his comments really brought to light why I like the format so much.  Here's an example (from the ComboBox code):</p>
<pre>/**
 * I am called to repopulate the dropdown.  There should never be a
 * need to invoke me externally.
 */
ComboBox.prototype.populateDropdown = function() {
  // ...
}</pre>
<p>Now you might say that there is no substantial difference between that and the more typical third-person documentation.  However, I think there's a huge difference.</p>
<p>First, in order to anthropomorphize the method enough to write in first person, you have to get inside what it's doing.  That provides useful insight, and can highlight behaviour that really doesn't make that much sense, even though it doesn't jump out and say "I'm wrong!".  It also makes the comments easier to write, I think, because it's more personal.</p>
<p>Second, writing in the first person lets you make certain statements in a far nicer way.  For example, there's no way to say "Barney implemented this method poorly", despite the fact that it might be true for whatever reason (misunderstandings, time constraints, etc.).  However, saying "I am implemented poorly" doesn't assign blame.  Even better is "I was poorly implemented by Barney", which would never be written by anyone except Barney, and therefore also prevents assignment of blame, but does let Barney take it upon himself.  Hopefully any of the three is immediately followed by how it was implemented poorly, why it was done that way, and even suggestions for improvement.</p>
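<p>As a made-up illustration (<code>filterByPrefix</code> is hypothetical, not part of ComboBox), a comment in that spirit might read:</p>

```javascript
/**
 * I return the entries whose label starts with the given prefix.  I am
 * implemented poorly: I rescan the whole list on every call, because the
 * list was tiny when I was written.  If it grows, sort it once and use a
 * binary search instead.
 */
function filterByPrefix(entries, prefix) {
  return entries.filter(function (entry) {
    return entry.label.indexOf(prefix) === 0;
  });
}
```

<p>The comment owns up to the shortcut, says why it was taken, and points at the fix, all without naming a culprit.</p>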
<p>The inverse of this is it lends a bit of personality to the code.  Seemingly arbitrary business requirements tend to end up with comments that have a bit of "attitude" to them, for example.  This personality is absolutely instilled by the developer(s) themselves, but it becomes an element of the code itself which makes it far easier to deal with.  It supplies a sort of contextual memory for the code when it's cracked open again in a different time.</p>
<p>Finally, the mindset of the reader (even if it's the same person who wrote it) is different when interfacing with something first hand.  The first person gives you the impression that the code is alive and talking to you, and as we all know, code is a living thing.  The third person is like what you'd see in a museum; information about a static snapshot of something dead.</p>
<p>I'd be quite interested to hear about others' experiences with this style of commenting, good or bad.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2009/06/02/first-person-documentation/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Shoot the Engineers</title>
		<link>http://www.barneyb.com/barneyblog/2009/05/26/shoot-the-engineers/</link>
		<comments>http://www.barneyb.com/barneyblog/2009/05/26/shoot-the-engineers/#comments</comments>
		<pubDate>Tue, 26 May 2009 18:47:43 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[cfml]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[personal]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=988</guid>
		<description><![CDATA[About a week ago, Marc Funaro wrote an interesting blog post about CFML and OO.  The prevailing opinion (via Twitter, blogs, etc) is that Marc is incorrect/inaccurate/inexperienced/whatever, and I disagree completely.  He hit the nail on the head.
HTTP is a stateless, request-response environment.  Nearly all web applications interface with a SQL database, which is also [...]]]></description>
			<content:encoded><![CDATA[<p>About a week ago, <a href="http://www.advantexllc.com/blog/">Marc Funaro</a> wrote an <a href="http://www.advantexllc.com/blog/post.cfm/how-oo-almost-destroyed-my-business">interesting blog post about CFML and OO</a>.  The prevailing opinion (via Twitter, blogs, etc) is that Marc is incorrect/inaccurate/inexperienced/whatever, and I disagree completely.  He hit the nail on the head.</p>
<p>HTTP is a stateless, request-response environment.  Nearly all web applications interface with a SQL database, which is also a predominantly stateless request-response environment.  Those are orthogonal to the core OO principle of interacting stateful objects.  It's far closer to the FP (functional programming) paradigm, but particularly on the SQL side, still doesn't match completely.</p>
<p>To use OO in a SQL-backed web app, you hide the mismatches with ORM, an object-based Front Controller implementation, and session facades.  As Marc points out, Java works pretty well in this paradigm for two main reasons:  Java is crazy fast, and Java developers have invested ridiculous amounts of effort in tooling to support this model.  CF has neither of these advantages.  I'm not belittling the effort poured into various frameworks (Fusebox, Model-Glue, ColdSpring, Transfer, etc.), just that they are significantly behind what is available to Java developers.</p>
<p>Unlike Marc, I happen to think that a Front Controller framework is essential, but I don't use an OO one for exactly the reasons he outlines.  I built <a href="http://www.barneyb.com/barneyblog/projects/fb3-lite/">FB3lite</a> for just this purpose: 70 lines of straightforward procedural code that help me enormously with certain common tasks.  I often masquerade my apps as standalone pages with mod_rewrite (converting /viewUser.html into /index.cfm?do=viewUser), but that's a cheat.</p>
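<p>To make the idea concrete, here's a small JavaScript sketch of the URL mapping plus a procedural front controller.  It is neither Apache configuration nor FB3lite's code; the <code>actions</code> table and both function names are hypothetical.</p>

```javascript
// Mirrors the rewrite described above: /viewUser.html becomes
// /index.cfm?do=viewUser.  Anything else passes through untouched.
function rewrite(path) {
  const match = /^\/(\w+)\.html$/.exec(path);
  return match ? "/index.cfm?do=" + match[1] : path;
}

// Hypothetical actions: plain functions, no objects required.
const actions = {
  viewUser: function () { return "user page"; },
  listUsers: function () { return "user list"; }
};

// The front controller: read the "do" parameter and run the matching function.
function dispatch(url) {
  const action = new URL(url, "http://localhost").searchParams.get("do");
  if (!Object.prototype.hasOwnProperty.call(actions, action)) {
    return "unknown action";
  }
  return actions[action]();
}
```

<p>Every request funnels through one entry point, but nothing about it needs an object model.</p>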
<p>I also use CFCs and ColdSpring for my business tier, but no object (domain) model for me.  The CFCs are really just glorified function libraries that I can use ColdSpring's AOP engine to wrap transactions around without having to manage them explicitly in my code.  In order to get the AOP I have to use CFCs, and I like the namespacing they provide (so I can have a 'doThing' method in multiple namespaces without conflict), but there is no real OO-ness there.</p>
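<p>The pattern, sketched in JavaScript with invented names (<code>withTransaction</code>, <code>adviseAll</code>, and the in-memory <code>log</code> are assumptions for illustration, not ColdSpring's API), is roughly this:</p>

```javascript
// Hypothetical transaction log so the sketch is observable without a database.
const log = [];

// "Around advice": run fn between begin and commit, rolling back on error.
function withTransaction(fn) {
  return function (...args) {
    log.push("begin");
    try {
      const result = fn.apply(this, args);
      log.push("commit");
      return result;
    } catch (err) {
      log.push("rollback");
      throw err;
    }
  };
}

// Wrap every function in a library, leaving the original untouched.
function adviseAll(library, advice) {
  const advised = {};
  for (const name of Object.keys(library)) {
    if (typeof library[name] === "function") {
      advised[name] = advice(library[name]);
    } else {
      advised[name] = library[name];
    }
  }
  return advised;
}

// Two "namespaces" that can both have a doThing without conflict.
const userService = adviseAll({ doThing: function (id) { return "user " + id; } }, withTransaction);
const orderService = adviseAll({ doThing: function (id) { return "order " + id; } }, withTransaction);
```

<p>The functions themselves never mention transactions; the wrapping happens once, at wiring time.</p>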
<p>I know what you're saying.</p>
<p>Yes, I often preach the benefits of OO and encourage people to learn about it and use it.  But using a howitzer to hunt mice in your garage is not a clever idea.  If I'm writing Java (or Groovy), I'm going to use OO structures, but that's because of the programming environment.  I am a pragmatic person.  I like to learn about a wide array of tools and then use as few of them as possible, knowing that there are other options available if I need them.</p>
<p>Yes, I built CFGroovy with Hibernate support so I could use ORM in my CFML apps via Groovy objects.  It provides the best of both worlds: the speed and tooling of Java in a CFML environment.  That approach works quite well, but if I don't need the complexity, I'm not going to do it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2009/05/26/shoot-the-engineers/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>My First cf.objective()</title>
		<link>http://www.barneyb.com/barneyblog/2009/05/26/my-first-cfobjective/</link>
		<comments>http://www.barneyb.com/barneyblog/2009/05/26/my-first-cfobjective/#comments</comments>
		<pubDate>Tue, 26 May 2009 15:12:54 +0000</pubDate>
		<dc:creator>barneyb</dc:creator>
				<category><![CDATA[cfml]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[personal]]></category>

		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=986</guid>
		<description><![CDATA[I know I'm late to the "cf.objective() recap" party, but I've been both crazy busy and rather tired, so I haven't got to it until now.
First, I'd never been to Minneapolis before, and from the little I saw, it's a pretty nice place.  Obviously I missed the "buried under snow" part, and that definitely puts [...]]]></description>
			<content:encoded><![CDATA[<p>I know I'm late to the "<a href="http://cfobjective.com/">cf.objective()</a> recap" party, but I've been both crazy busy and rather tired, so I haven't got to it until now.</p>
<p>First, I'd never been to Minneapolis before, and from the little I saw, it's a pretty nice place.  Obviously I missed the "buried under snow" part, and that definitely puts a damper on it as a potential home, but I liked it.  Very walkable, clean, and aside from the second-story causeways between the buildings, a nice aesthetic overall.  The hotel was in a great spot, with a pretty varied selection of dining and drinking establishments within easy walking distance.</p>
<p>Before I got there, I hadn't quite internalized how small a 200-person conference actually is.  "Social" is a skill I didn't inherit from my father, unfortunately, but with the number of people I knew already, I didn't feel nearly as isolated as I often do at CFUNITED (which is five times the size).</p>
<p>The sessions were pretty good, overall.  I didn't get to go to several that I would have liked to because of scheduling, but c'est la vie.  Here's a rundown of the notable ones I attended:</p>
<p>Adobe's keynote the first day was interesting, and I might be mixing it in with some other Adobe presentations, but quite fascinating to see crowd reaction to certain Centaur features.  CFFINALLY and CFCONTINUE?  Nothing.  It should be noted that I was the only one to applaud them last year at CFUNITED.  Remote diff of server configuration?  Huge applause.  WTF?!?  Script your production environments, people.  If they're ever out of sync, <em>you're doing your job <strong>wrong</strong></em>.   ORM stuff got much applause, of course, and rightfully so.  Drag and drop, full-stack scaffolding also did.  Do people actually use that?  Great marketing/sales tool, no question, but for actual applications?!?!  But I digress&#8230;</p>
<p>Marc Escher's talk on unit testing was quite interesting, I thought.  I've tried numerous times, with numerous technologies, to really embrace unit testing and failed every time.  Actually had the best luck doing it with Flex, which just drips with irony.  I'm not predicting success next time I attempt it, but I'm confident I'll do better than last time.  In a similar vein, Sean Corfield's talk on cf.spec provided some nice pointers.  I'm not too sure about the "readable" spec document concept, but it's an interesting technique.  Until you can have exactly one spec document, I'm not sure of the utility, but I think that's really an editor/syntax problem, not a conceptual one.</p>
<p>Mark Mandel's intro to Transfer was quite interesting as well.  That I attended might surprise you if you're familiar with the various Hibernate projects I've worked on, but ORM is still voodoo in my mind.  Coming back to the basics and being "introduced" to ORM from the ground up is always interesting, because the subtleties in interpretation provide a great introspective of ORM as a whole.  The odds of me picking up Transfer and using it on a "real" project are pretty small, but I didn't go to learn about Transfer in particular, more about ORM in general.</p>
<p>Let me be clear on this: Transfer is amazing.  It does things with CFML that I would have sworn were impossible, and does them fast enough to be perfectly serviceable.  It's just not the tool that fits my style.  I've been a Hibernate user for many years, and that's a hard framework to supersede.  Honestly, I bet I'll never replace Hibernate with another ORM solution, but instead replace it with an alternate approach (an object database, for example).</p>
<p>As you might expect, I also went to Adobe's talks about the new ORM functionality coming in Centaur.  When I first was exposed to their Hibernate implementation, I was pretty skeptical.  There seemed to be a global misunderstanding of both the technology and the problem it was designed to solve, but that has turned around 180 degrees, and Centaur looks to have pretty robust ORM capabilities.  I've got a major bone to pick with how Adobe is <em>marketing</em> the functionality, but the actual implementation looks pretty sound.  It's hard to get a complete picture with the pre-release secrecy, but I'm a lot more excited about it than I was 6 months ago.</p>
<p>Even more exciting is what will hopefully be coming out of Railo/JBoss in the coming months.  There's been no formal talk of what that looks like yet, and it's probably a safe bet that it'll be similar to ColdFusion's implementation (for obvious reasons), but with Railo and Hibernate both under the JBoss umbrella, I think there's some cool stuff on the way.  Obviously any speculation is just that, but with Railo supplanting ColdFusion in a lot of places I use CFML, I'm understandably excited about it.</p>
<p>The last session was Adam Haskell's talk on mentoring and code review.  That is a misnomer of a title, if you ask me, because while he did talk about that stuff, the point was really about team dynamics.  Working on a team is hard.  Working in a "get things done" environment only makes it worse.  Fostering the team, particularly around helping junior developers move up in the world, takes time and effort, but it's worth it.  I think Adam did a good job of emphasising that any sort of formal process is less effective than an equivalent informal process.  Informal is inherently more personal, and with the typically sterile world of technology and computers, the "personal" stuff is really important.</p>
<p>Of course, the big draw of any conference (though the hardest to justify) is the meals/drinks/etc. that happen outside the actual conference.  (It seems like I just said the same thing two sentences in a row, completely by accident.)  With the conference as small as it was, the informal socialization was a lot tighter, I thought.  Far less spreading of groups, and so more churn within them.</p>
<p>I also really liked the way they did lunch, with actual table service, rather than a buffet.  With a more formal meal, you end up sitting and talking with fewer people, but for a longer period of time.  Aside from fostering more involved conversation, it also provides a nice break from the chaos to resettle for the afternoon.</p>
<p>Talking with other developers is always fun, and typically the source of the best tidbits of information.  I always like learning about stuff, even if it has no direct applicability, because it gets your mind thinking in ways it otherwise wouldn't.  And it seems to happen pretty often that 6 months down the road one of those random bits of information suddenly becomes fairly relevant.  Maybe not directly, but it at least opens my eyes to some potential approach I wouldn't have otherwise considered.</p>
<p>Great conference, overall.  I gotta hand it to Jared and his team.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.barneyb.com/barneyblog/2009/05/26/my-first-cfobjective/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
