Yep. This is just the first cut of the entire system, so there are a lot of issues like this to work out. At this point it's basically prototyping the AllMusic and LimeWire integration; next up is making things nicer. Deduping is a really hard problem in general, though. Nobody ever tags correctly, and the hash computed for your tune will most likely differ from the hash of the one it's downloading.
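To make the dedup problem a bit more concrete: since hashes won't match and tags are sloppy, one option is to compare loosely normalized artist/title pairs instead. This is only a rough sketch in Python, not anything Serendipity does today; `normalize`, `tune_key`, and `already_have` are made-up names, and the normalization rules (strip accents, drop bracketed suffixes, drop a leading "the") are assumptions that will certainly miss cases.

```python
import re
import unicodedata


def normalize(text):
    """Reduce a tag string to a loose, comparable form (assumed rules)."""
    # Strip accents: "Björk" -> "Bjork"
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    text = text.lower()
    # Drop bracketed suffixes such as "(Remastered)" or "[Live]"
    text = re.sub(r"\(.*?\)|\[.*?\]", " ", text)
    # Collapse punctuation and whitespace into single spaces
    words = re.sub(r"[^a-z0-9]+", " ", text).split()
    # Drop a leading "the", unless that would leave nothing
    if len(words) > 1 and words[0] == "the":
        words = words[1:]
    return " ".join(words)


def tune_key(artist, title):
    """Hypothetical dedup key: normalized (artist, title) pair."""
    return (normalize(artist), normalize(title))


def already_have(artist, title, library_keys):
    """True if a tune with the same loose key is already in the library."""
    return tune_key(artist, title) in library_keys
```

You would build `library_keys` once from the tunes on the player and check each candidate download against it. It trades false positives (a live version matching the studio version) for fewer duplicates.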

For some background, there are a couple layers to Serendipity:

First is a Ranker -- this is the thing that gives a list of the most highly ranked tunes on your Empeg (some algorithm that decides what you really like or don't like; its output is the data set that recommendations are based on). The current implementation is RandomEmpegRanker, which just picks random tunes off your Empeg. Clearly there's room for smarter implementations in the future.
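As a rough illustration of the shape of that layer (sketched in Python; the class and method names below are invented for the example and are not the real Serendipity interfaces), a ranker is just something that returns the top N tunes, and the random one samples without replacement:

```python
import random
from abc import ABC, abstractmethod


class Ranker(ABC):
    """Hypothetical interface: return the `count` most highly ranked tunes."""

    @abstractmethod
    def top_tunes(self, tunes, count):
        ...


class RandomRanker(Ranker):
    """Stand-in for a random ranker: every tune is equally 'liked'."""

    def __init__(self, seed=None):
        # A private RNG so a seed gives repeatable picks in tests
        self._rng = random.Random(seed)

    def top_tunes(self, tunes, count):
        tunes = list(tunes)
        # sample() picks without replacement; cap at the library size
        return self._rng.sample(tunes, min(count, len(tunes)))
```

Any smarter ranker would only need to implement the same one method.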

Next is the Recommender -- this takes your ranked tunes and uses some algorithm to recommend other tunes that should be downloaded (actually it returns metadata about tunes to download, so that could be "all songs by Artist X" or a specific tune). Currently it recommends based only on artist, by ripping data from AllMusic.com. This could be layered with other recommenders that take a set of recommended metadata and run it through a filter to remove things you already have, which would address some of the problems you're seeing.
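A layered "filter" recommender along those lines might look something like this. It is a sketch under assumptions: `Recommendation` and `drop_owned` are hypothetical names, a recommendation is modeled as an artist plus an optional title, and matching is a plain case-insensitive comparison.

```python
from typing import NamedTuple, Optional


class Recommendation(NamedTuple):
    """Hypothetical metadata record returned by a recommender."""
    artist: str
    title: Optional[str] = None  # None means "anything by this artist"


def drop_owned(recommendations, owned):
    """Remove specific-tune recommendations that are already owned.

    `owned` is an iterable of (artist, title) pairs from the player.
    Artist-only recommendations pass through untouched, because they
    cannot be resolved to a single tune at this stage.
    """
    owned_keys = {(a.casefold(), t.casefold()) for a, t in owned}
    kept = []
    for rec in recommendations:
        if rec.title is not None:
            key = (rec.artist.casefold(), rec.title.casefold())
            if key in owned_keys:
                continue  # already have this exact tune
        kept.append(rec)
    return kept
```

Because it takes recommendations in and hands recommendations back, it can be stacked on top of whatever recommender produced the list.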

Third is a MusicProvider -- this is a place it can go to search for tunes that match the criteria returned by the recommender and download them. Right now it pretty blindly goes out to LimeWire (Gnutella), searches for the String form of the metadata with "mp3" attached to the end, and downloads up to a certain number of matches for each recommendation. This is another place the filter could come in (i.e. try to figure out if you already have this tune -- probably a variation on the Ranker, where the provider passes matches back through the ranker so the ranker can decide whether each tune should be downloaded -- right now it doesn't do that stage).
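In sketch form, the provider step amounts to building a search string and capping the downloads. The Python below only illustrates that logic: `build_query` and `pick_downloads` are invented names, no real LimeWire or Gnutella call is shown, and the `should_download` hook stands in for the not-yet-implemented "pass matches back through the ranker" stage.

```python
def build_query(artist, title=None):
    """Join the metadata into a search string with "mp3" on the end."""
    parts = [artist]
    if title:
        parts.append(title)
    parts.append("mp3")
    return " ".join(parts)


def pick_downloads(matches, max_per_recommendation,
                   should_download=lambda match: True):
    """Take matches in order until the per-recommendation cap is hit.

    `should_download` is a hypothetical veto hook; the default accepts
    everything, which mirrors the current "blind" behaviour.
    """
    chosen = []
    for match in matches:
        if len(chosen) >= max_per_recommendation:
            break
        if should_download(match):
            chosen.append(match)
    return chosen
```

Plugging a "do I already have this?" check into `should_download` is where the dedup filter would slot in.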

Clearly each of the stages is a first draft, but the basic system is here. One of the things I'd like to add is a rating system like TiVo has, so it can be smarter about what it tries to download -- for now, number of plays is probably a pretty good proxy.
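A play-count ranking could be as simple as the following sketch. It assumes each tune record carries a play count; the `Tune` record and `rank_by_play_count` are hypothetical, and how the count would actually be read off the player is not shown.

```python
from typing import NamedTuple


class Tune(NamedTuple):
    """Hypothetical tune record with a play counter."""
    artist: str
    title: str
    play_count: int


def rank_by_play_count(tunes, count):
    """Return the `count` most-played tunes, most played first."""
    # sorted() is stable, so ties keep their original library order
    ranked = sorted(tunes, key=lambda tune: tune.play_count, reverse=True)
    return ranked[:count]
```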

Mike