2003-08-19

Experienced classifier wanted

Tue Aug 19 20:18:41 BST 2003

One question bubbles up in my mind reading about Bayesian classifiers.  They all seem to be naive.  So, what does an experienced Bayesian classifier look like?

No messing with JavaScript

Tue Aug 19 20:16:51 BST 2003

For anyone doing JavaScript work I can highly recommend both David Flanagan's JavaScript: The Definitive Guide and Danny Goodman's JavaScript & DHTML Cookbook.  The guide is a pretty comprehensive reference that I turn to a lot.  The cookbook is a very interesting mix of practical JavaScript and DHTML techniques.  I'm particularly interested in the possibilities for drag & drop in web interfaces.  Marc Barrot uses this to good effect in WebOutliner.

Bayes makes topics

Tue Aug 19 19:57:08 BST 2003

Doing a bit of digging into Bayesian filtering.  Although it may have other uses in K-Collector later on, my initial thought is to use a Bayesian filter to automatically suggest topics for weblog posts via the K-Collector client.  If it's good enough we might even be able to skip user approval for some topics.

At the moment the client is using a simple keyword stemmer which is effective, up to a point, but produces a lot of false positives.  I'm hoping that a trained Bayesian classifier will do a lot better.  Of course this raises the issue of how it gets trained but that is a bridge of a very different colour.
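To make the idea concrete, here is a minimal sketch of what topic suggestion with a naive Bayes classifier could look like.  It is not K-Collector code: the class name NaiveBayesTopics, the tokenizer and the tiny training posts are all made up for illustration, and a real client would need far more training data plus some threshold before suggesting anything.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTopics:
    """Hypothetical multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # topic -> word -> count
        self.post_counts = Counter()             # topic -> training posts seen
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, topic):
        """Record the words of one post that the user filed under topic."""
        words = tokenize(text)
        self.word_counts[topic].update(words)
        self.post_counts[topic] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Return log P(topic) + sum of log P(word | topic) for each topic."""
        total_posts = sum(self.post_counts.values())
        vocab_size = len(self.vocabulary)
        words = tokenize(text)
        result = {}
        for topic, n_posts in self.post_counts.items():
            total_words = sum(self.word_counts[topic].values())
            score = math.log(n_posts / total_posts)
            for word in words:
                count = self.word_counts[topic][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (total_words + vocab_size))
            result[topic] = score
        return result

    def suggest(self, text):
        """Return the single highest-scoring topic for a new post."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Made-up training posts, one per topic, purely for illustration.
clf = NaiveBayesTopics()
clf.train("java byte code class library bcel", "Java")
clf.train("baseball yankees players salaries statistics", "Baseball")
clf.train("javascript dhtml drag drop web interface", "JavaScript")

print(clf.suggest("a new library for generating java classes"))
```

Working in log probabilities avoids multiplying many tiny numbers together.  Note that "classes" does not match "class" here because the sketch does no stemming, so the existing stemmer could still earn its keep as a pre-processing step.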

Some resources I have come across:

Moneyball

Tue Aug 19 18:56:56 BST 2003

Recommended Listening

I'm a long-time baseball fan, and I can appreciate the dissection of statistics to get an advantage in any situation. Michael Lewis' Moneyball: The Art of Winning an Unfair Game takes both these passions of mine and wraps them into a great story about how an underfunded team can compete with anyone, even the deep-pocket New York Yankees by choosing players and salaries wisely.

[RatcliffeBlog: Business, Technology & Investing]

I'll second that; Moneyball is a very cool book which I found hard to put down.  I really enjoyed Bob Costas's book 'Fair Ball' as well.

Asphalt River

Tue Aug 19 13:54:37 BST 2003

Walking on Asphalt River.

Yesterday afternoon, I stood in my driveway and looked at the black asphalt road passing by my house.  I asked myself what if the asphalt was liquid and the road was a river.  What a river it is, flowing everywhere people live.  Six degrees of separation pale in comparison.

I imagined wading into the river up to my neck which awarded me with a smile.  Not a bad return for playing with imagination a little.  After spending a few more minutes picturing myself swimming in the road, I moved on to what I set out to do.  I walked onto the asphalt road.  Ahhh, there!  A flicker of amazement visits me.  I am walking on water!  With that, I walked safely back to my driveway with an even bigger smile.

There are things that amaze us, but amazement itself is entirely our own making.  What saddens me is how fast amazement fades into mundane.  Things, places, people, understanding -- nothing escapes, all fading like photos left out in sunlight -- flowers, mountains, Ferrari, Walkman, campfire, my son's little toes, all becomes mundane eventually.

So I am left with cheap thrills like the one I pulled yesterday.  I am still amazed with how stupid human mind is, but I am sure that will fade too.

I won't be able to blog for the next two days.  Find your own supply of amazements meanwhile.

[Don Park's Daily Habit]

Better byte code management

Tue Aug 19 12:32:17 BST 2003

ASM is a Java byte code engineering library similar to Apache BCEL.  The claimed advantages are that it is much smaller and faster than BCEL.  The figures given were 21K vs. 350K and an average overhead of 60% vs. ~700%.  Much of this appears to be down to ASM's decision not to represent the class via an object model, which may bring other usability tradeoffs.  However, with performance in this range ASM may be a good fit for applications which require dynamic modification or generation of classes.

Crime is bad Reverend

Tue Aug 19 10:01:29 BST 2003

Jeffrey Hicks sent me a comment in response to my recent post thinking about introducing Bayesian filtering into K-Collector.  He pointed me at an application he is demoing called Reverend.  It combines a Bayesian filter with the WordNet lexical database so that people can train it on whether words are good or bad.

The result is that it can tell you that crime is bad and, by inference, that bribery, racketeering, larceny and theft are bad too!
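I don't know how Reverend does this internally, but the inheritance step can be sketched on its own: a word with no training of its own takes the label of its nearest trained ancestor in an "is a kind of" hierarchy.  The sketch below leaves out the Bayesian filter entirely, and the HYPERNYMS table and infer_label function are invented for illustration, not copied from WordNet or from Reverend.

```python
# Hand-written "is a kind of" links, invented for this example.
# A real system would read these relations from WordNet instead.
HYPERNYMS = {
    "bribery": "crime",
    "racketeering": "crime",
    "larceny": "theft",
    "theft": "crime",
    "crime": "activity",
}


def infer_label(word, trained, hypernyms=HYPERNYMS):
    """Return the label of word or of its nearest trained ancestor.

    trained maps words the user has labelled to "good" or "bad".
    Returns None when neither the word nor any ancestor is labelled.
    """
    seen = set()  # guards against accidental cycles in the table
    while word is not None and word not in seen:
        if word in trained:
            return trained[word]
        seen.add(word)
        word = hypernyms.get(word)
    return None


# The user only ever says that crime is bad.
trained = {"crime": "bad"}

for word in ["bribery", "racketeering", "larceny", "theft", "activity"]:
    print(word, infer_label(word, trained))
```

Labels only flow downwards from general to specific, so in this sketch "activity" stays unlabelled even though one of its kinds is bad.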

Somewhat reminiscent of some of the Cyc inference examples.