Weka Mining Update: It Works!

Back in April, I posted about how I was using Weka to do some asset prioritization. The gist of it was that users would rank assets on a 1-5 scale, and then the system would recommend other assets that it thought they'd like. This is the Netflix problem, if you're familiar with that, though without the time component.

Over the intervening months, I've gone back and tuned the mechanism slightly. The algorithm hasn't undergone any fundamental changes; I've just made a few little tweaks here and there. A cluster of related assets doesn't tell you which one a given person wants, but if you can figure out which cluster they like best, other assets from that cluster are probably desirable. Most of those tweaks went into the specifics of ranking the clusters and selecting a best fit, not into the Weka implementation itself.
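To make the cluster-ranking idea a bit more concrete, here's a minimal Python sketch. It isn't my actual code, and it doesn't touch Weka at all: it assumes the clustering has already been done and handed over as a plain asset-to-cluster mapping. The names rank_clusters, recommend, ratings, and cluster_of are made up for illustration, and scoring a cluster by the user's mean rating is just one plausible way to pick a "best fit."

```python
from collections import defaultdict


def rank_clusters(ratings, cluster_of):
    """Order cluster ids by the user's mean rating, best first.

    ratings:    {asset_id: score on the 1-5 scale} for one user (hypothetical shape)
    cluster_of: {asset_id: cluster_id}, assumed to come from a prior clustering run
    """
    totals = defaultdict(lambda: [0, 0])  # cluster_id -> [sum of scores, count]
    for asset, score in ratings.items():
        cluster = cluster_of[asset]
        totals[cluster][0] += score
        totals[cluster][1] += 1
    means = {c: s / n for c, (s, n) in totals.items()}
    # Highest mean first; ties are broken by cluster id so the order is stable.
    return sorted(means, key=lambda c: (-means[c], c))


def recommend(ratings, cluster_of, limit=3):
    """Suggest unrated assets, drawing from the best-liked clusters first."""
    picks = []
    for cluster in rank_clusters(ratings, cluster_of):
        for asset in sorted(cluster_of):
            if cluster_of[asset] == cluster and asset not in ratings:
                picks.append(asset)
                if len(picks) == limit:
                    return picks
    return picks


# Toy data: six assets in three clusters, one user with three rankings.
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 2}
ratings = {"a": 5, "b": 4, "d": 2}
suggestions = recommend(ratings, cluster_of, limit=2)
```

Note that in this sketch a cluster the user has never rated (cluster 2 above) never gets recommended, which is one reason to keep some random selection in the mix.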

Here's a chart that shows average rank and total ranks broken down by month:

As you can see, before the Weka switchover, the average rank for assets was hovering right around 3.2. That trend actually goes back a couple more years; I'm just not showing it here. Since I started using Weka, the average has climbed and is now around 3.7. That's a gain of about half a point, or roughly 16%, which is pretty significant when you consider that some asset selection remains totally random to help seed the asset pool.
