<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Data Mining With Weka</title>
	<atom:link href="http://www.barneyb.com/barneyblog/2008/04/24/data-mining-with-weka/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.barneyb.com/barneyblog/2008/04/24/data-mining-with-weka/</link>
	<description>Thoughts, rants, and even some code from the mind of Barney Boisvert.</description>
	<lastBuildDate>Thu, 11 Sep 2014 09:58:12 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Dan</title>
		<link>https://www.barneyb.com/barneyblog/2008/04/24/data-mining-with-weka/comment-page-1/#comment-198470</link>
		<dc:creator>Dan</dc:creator>
		<pubDate>Sat, 05 Dec 2009 21:49:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=384#comment-198470</guid>
		<description>The clustering algorithm Weka uses is determined by the command you run; in the case above, it is k-means clustering.
http://en.wikipedia.org/wiki/K-means_clustering#Standard_algorithm

Expectation Maximisation (EM) clustering is &quot;better&quot; and, as a plus, you don&#039;t have to guess or provide the number of clusters: it can work this out using cross-validation (internally it runs multiple times and picks the number of clusters that gave the highest cross-validated log-likelihood).</description>
		<content:encoded><![CDATA[<p>The clustering algorithm Weka uses is determined by the command you run; in the case above, it is k-means clustering.<br />
<a href="http://en.wikipedia.org/wiki/K-means_clustering#Standard_algorithm" rel="nofollow">http://en.wikipedia.org/wiki/K-means_clustering#Standard_algorithm</a></p>
<p>Expectation Maximisation (EM) clustering is "better" and, as a plus, you don't have to guess or provide the number of clusters: it can work this out using cross-validation (internally it runs multiple times and picks the number of clusters that gave the highest cross-validated log-likelihood).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Weka Mining Update: It Works! at BarneyBlog</title>
		<link>https://www.barneyb.com/barneyblog/2008/04/24/data-mining-with-weka/comment-page-1/#comment-108711</link>
		<dc:creator>Weka Mining Update: It Works! at BarneyBlog</dc:creator>
		<pubDate>Thu, 24 Jul 2008 21:08:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=384#comment-108711</guid>
		<description>[...] in April, I posted about how I was using Weka to do some asset prioritization. The gist of it was that users would rank assets on a 1-5 scale, and then the system would [...]</description>
		<content:encoded><![CDATA[<p>[...] in April, I posted about how I was using Weka to do some asset prioritization. The gist of it was that users would rank assets on a 1-5 scale, and then the system would [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: barneyb</title>
		<link>https://www.barneyb.com/barneyblog/2008/04/24/data-mining-with-weka/comment-page-1/#comment-99303</link>
		<dc:creator>barneyb</dc:creator>
		<pubDate>Tue, 24 Jun 2008 18:48:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=384#comment-99303</guid>
		<description>Jim,

It&#039;s hard to say.  There has been a slight skewing downward overall, but from some manual checks that I&#039;ve done against asset similarity (e.g. manually figuring out what cluster an unranked asset probably belongs to), it&#039;s reasonable.  To put that another way, prioritized assets average a slightly higher rank than randomized assets, which is just what you&#039;d expect.  I haven&#039;t done any actual math to quantify the skew, but only because what I&#039;m seeing seems reasonable.</description>
		<content:encoded><![CDATA[<p>Jim,</p>
<p>It's hard to say.  There has been a slight skewing downward overall, but from some manual checks that I've done against asset similarity (e.g. manually figuring out what cluster an unranked asset probably belongs to), it's reasonable.  To put that another way, prioritized assets average a slightly higher rank than randomized assets, which is just what you'd expect.  I haven't done any actual math to quantify the skew, but only because what I'm seeing seems reasonable.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jim</title>
		<link>https://www.barneyb.com/barneyblog/2008/04/24/data-mining-with-weka/comment-page-1/#comment-99298</link>
		<dc:creator>Jim</dc:creator>
		<pubDate>Tue, 24 Jun 2008 18:16:36 +0000</pubDate>
		<guid isPermaLink="false">http://www.barneyb.com/barneyblog/?p=384#comment-99298</guid>
		<description>Hi Barney,

How did your test of seeding unranked items go?  Did it skew the results at all?  

Can you post some of your findings?

Thanks,

--j</description>
		<content:encoded><![CDATA[<p>Hi Barney,</p>
<p>How did your test of seeding unranked items go?  Did it skew the results at all?  </p>
<p>Can you post some of your findings?</p>
<p>Thanks,</p>
<p>&#8211;j</p>
]]></content:encoded>
	</item>
</channel>
</rss>
