Thursday, July 16, 2009

Practical uses of n-euclidean distance

One thing I really hate is tech blogs, or programming blogs for that matter, that pick a subject, throw some code on the screen, and leave the discussion at that.

That doesn't really show the practicality of the author's opinion, or even their intelligence. I've seen some very messy code written by some very intelligent folks (myself included), so just because you can splatter some pixels in the shape of keywords doesn't mean that you really know what you're talking about.

Most importantly - talking is not communicating. It only becomes communication when the reader understands what you're blathering on about. Just ask my wife - I communicate very poorly A LOT of the time. ;-)

Anywho -

I thought I would share some practical examples of using the Scheme code I tossed up here the other day. If you don't recall it, just scroll down and re-read the previous post.

If you're following along with a PocketPC with Scheme then please leave a comment and we'll talk. Seriously.

You can also follow along with PLT Scheme; just remember to put it in "R5RS" mode before you start.

Example 1:

You're looking at the map for a place to eat. You've been driving all day (perhaps on a road trip between Halifax and Toronto), and the map shows that there is a Jack Astors and a Moxies, both within a reasonable distance.

But how close are they? Well, let's say you have one of those maps with a row of numbers along one edge and a row of letters along the other. Replace the letters with numbers (A = 1, B = 2, etc.) and you now have a map with Cartesian coordinates, ready for the n-euclidean function I wrote for you (yes, just for you) to find out which one is closer.

Remember, the higher the number this function returns, the smaller the distance between you and the object. (Counterintuitive, but there you have it.)
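Roughly speaking, the function boils down to something like the following. Treat it as a simplified sketch that reproduces the numbers in this post (the score works out to 1 divided by the straight-line distance); the version in the previous post may differ in its details.

```scheme
; Simplified sketch that matches the outputs shown in this post;
; the code in the previous post may differ.
(define (n-euclidean a b)
  ; sum of squared differences, coordinate by coordinate
  (define sum-of-squares
    (apply + (map (lambda (x y) (* (- x y) (- x y))) a b)))
  ; reciprocal of the distance: closer points give a bigger score
  (/ 1.0 (sqrt sum-of-squares)))
```

One thing to watch for: two identical lists have a distance of zero, so this sketch divides by zero on a perfect match. A sturdier version would want to guard against that.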

So, here we go:


(define you (list 12 12)) ;you are at 12, 12 on the map
(define moxies (list 4 3))
(define jack-astors (list 23 20))

> (n-euclidean jack-astors you)
0.07352146220938077

> (n-euclidean moxies you)
0.08304547985373997

So, based on this, Moxies is closer to you than Jack Astors: its score is higher by about 0.0095. Note that this gap is a difference between scores, not a number of units on your map.
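Because the score is a reciprocal, the gap between two scores isn't a distance on the map. If you want the answer in grid squares, a plain distance function does the job. Here is a quick sketch; euclidean-distance is a helper made up for this example, not something from the previous post.

```scheme
; Hypothetical helper (not from the previous post): plain Euclidean
; distance between two equal-length lists of coordinates.
(define (euclidean-distance a b)
  (sqrt (apply + (map (lambda (x y) (* (- x y) (- x y))) a b))))

(define you (list 12 12))
(define moxies (list 4 3))
(define jack-astors (list 23 20))

(euclidean-distance moxies you)       ; roughly 12.04 grid squares
(euclidean-distance jack-astors you)  ; roughly 13.60 grid squares
```

So Moxies is about 1.56 grid squares closer.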

Next up is a more personal type of usage. Let's say that my friend has been raving about this new online movie critic who knows her stuff and can really pick good movies. Fortunately, she's been posting for a while and has been working through a list of older titles too, so I'm able to compile a list of movies she has reviewed that I think are brilliant and ones that I think suck eggs.

Here are my ratings for the following titles:

Lost Horizon 5
Big Trouble Lt. China 4
Matrix 5
American Beauty 4
LA Confidential 4
F&L in Las Vegas 5
English Patient 1
Far and Away 1
Donnie Darko 4
Good Will Hunting 5

and the critic's ratings:


Lost Horizon 5
Big Trouble Lt. China 2
Matrix 4
American Beauty 5
LA Confidential 5
F&L in Las Vegas 1
English Patient 5
Far and Away 5
Donnie Darko 2
Good Will Hunting 5

and, for good measure, the scores from Rotten-Tomatoes.com (each percentage score is divided by 20 to put it on the same 5-point scale):
Lost Horizon 5
Big Trouble Lt. China 4.15
Matrix 4.3
American Beauty 4.45
LA Confidential 4.95
F&L in Las Vegas 2.4
English Patient 4.2
Far and Away 2.4
Donnie Darko 4.15
Good Will Hunting 4.85
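The rescaling is just a division by 20. In the sketch below, percent->stars is a made-up helper name, and the percentages are back-computed from the list above (for example, 4.15 × 20 = 83) rather than copied from the site.

```scheme
; Hypothetical helper: map a 0-100 percentage onto a 0-5 scale.
(define (percent->stars p) (/ p 20.0))

; Percentages back-computed from the 5-point list above
; (e.g. 4.15 * 20 = 83); not quoted from the site itself.
(define rt-percent (list 100 83 86 89 99 48 84 48 83 97))

(map percent->stars rt-percent)
; approximately (5.0 4.15 4.3 4.45 4.95 2.4 4.2 2.4 4.15 4.85)
```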

So, with all the data - we're ready to find out which critic matches my mindset the most.


(define me (list 5 4 5 4 4 5 1 1 4 5))
(define online-critic (list 5 2 4 5 5 1 5 5 2 5))
(define rotten-tomatoes (list 5 4.15 4.3 4.45 4.95 2.4 4.2 2.4 4.15 4.85))

> (n-euclidean me online-critic)
0.13018891098082389

> (n-euclidean me rotten-tomatoes)
0.2202060992539127

So, it would appear that I should stick with getting my reviews from Rotten Tomatoes, which scores higher (~0.22 versus ~0.13), and ignore this new would-be darling of the online movie criticism space.

The last example is somewhat contrived, but it gives an idea of how this kind of similarity scoring could be used in the workplace.

Let's say that the program manager over at Acme Widgets has hired me to create a system which will help him pair up his programmers into XP teams in such a way that the programmers don't kill each other or devolve into hostile silence with zero productivity.

So, I generate a list of touchy-feely questions such as:
  • I prefer rainy days to sunny days
  • I prefer cats to dogs
  • One must create the plan for the application before starting
  • Blue is a nice colour
And so on, for 20 questions in total. The users rate the 'truth' of each statement on a scale of 1 to 10, with 1 being "I totally disagree" and 10 being "Amen brother - preach it".

So, let's see what we get:

(define joe-javascript (list 1 2 4 3 2 1 1 2 4 5 6 9 7 6 4 5 6 1 1 1))
(define robert-ruby (list 3 4 6 5 4 3 3 4 6 7 0 0 0 0 6 7 0 3 3 3))
(define lucius-lisp (list 1 2 2 9 9 8 7 9 8 8 7 8 9 0 8 8 7 7 8 7))
(define charlie-c (list 2 3 5 4 3 2 2 3 5 6 3 5 4 3 5 6 3 2 2 2))

> (n-euclidean joe-javascript robert-ruby)
0.057928444636349226
> (n-euclidean joe-javascript lucius-lisp)
0.047836487323493986
> (n-euclidean joe-javascript charlie-c)
0.12216944435630522
> (n-euclidean robert-ruby lucius-lisp)
0.047619047619047616
> (n-euclidean robert-ruby charlie-c)
0.10976425998969035
> (n-euclidean lucius-lisp charlie-c)
0.052999894000318

So, let's piece this together. At first look, it appears that the best match is Joe and Charlie at ~0.1222, but pairing them would mean that Lucius and Robert have to work together, and that gives us the lowest score of the set at ~0.0476.

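The remaining alternative, Joe with Lucius and Robert with Charlie, has the same problem: Joe and Lucius only score ~0.0478. That leaves Joe with Robert (~0.0579) and Lucius with Charlie (~0.0530) as the pairing whose weakest team is the strongest. Charlie seems to be the most balanced of the group, scoring well with everyone except Lucius.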
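The reasoning here amounts to picking the pairing whose weaker team has the highest score. A minimal sketch of that check follows. pairing-score is a name made up for this example, and n-euclidean is repeated in a simplified form that matches the outputs above, so the version in the previous post may differ.

```scheme
; Simplified n-euclidean that matches the outputs in this post;
; the previous post's code may differ.
(define (n-euclidean a b)
  (/ 1.0 (sqrt (apply + (map (lambda (x y) (* (- x y) (- x y))) a b)))))

(define joe-javascript (list 1 2 4 3 2 1 1 2 4 5 6 9 7 6 4 5 6 1 1 1))
(define robert-ruby (list 3 4 6 5 4 3 3 4 6 7 0 0 0 0 6 7 0 3 3 3))
(define lucius-lisp (list 1 2 2 9 9 8 7 9 8 8 7 8 9 0 8 8 7 7 8 7))
(define charlie-c (list 2 3 5 4 3 2 2 3 5 6 3 5 4 3 5 6 3 2 2 2))

; Hypothetical helper: a pairing is only as good as its weaker team,
; so score it by the lower of the two team scores.
(define (pairing-score a b c d)
  (min (n-euclidean a b) (n-euclidean c d)))

(pairing-score joe-javascript charlie-c robert-ruby lucius-lisp)  ; about 0.0476
(pairing-score joe-javascript robert-ruby lucius-lisp charlie-c)  ; about 0.0530
(pairing-score joe-javascript lucius-lisp robert-ruby charlie-c)  ; about 0.0478
```

With only four people there are just three possible pairings, so checking them all by hand is fine. A bigger team would need something smarter than brute force.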
