You do realize, of course, that desktop search tools are going to be the new toolbars, right?
It’s not going to be horribly long before every two-bit player puts together their own, adds some new widgety gizmo, and flaunts it as the latest cool app. Thing is, doing desktop search really isn’t that hard. The difficult part is coming up with the disk space to hold the index, but with today’s massive drives, even that’s not a big problem (How many folks actually fill all 100GB of storage?). The algorithm is fairly simple and well understood, and unlike the Internet, you (personally) don’t generate nearly as much content in a given day.
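To show how simple the core idea is, here is a minimal sketch of an inverted index in Python. It is an illustration only, not what any particular desktop search product does; the function names (`tokenize`, `build_index`, `search`) and the sample documents are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical file contents, standing in for files read from disk.
documents = {
    "notes.txt": "Desktop search tools are the new toolbars",
    "todo.txt": "Build a search index with Perl",
    "lyrics.txt": "Full lyrics for stored tracks",
}
index = build_index(documents)
```

Calling `search(index, "search index")` intersects the posting sets for both words, so only files containing both come back. The index grows with the number of distinct words, which is where the disk space goes.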
Plus, some of the search mechanisms are just silly. Are they really running my pictures through an embedded OCR package looking for sign text or doing a lyric search on the various MP3s i have stored? i hope so, because i’d love to see the full lyrics to some of the Kleptone tracks i’ve got stored. Chances are, no. They’re just recording the name and whatever bits of meta information the file was tagged with at creation. i suppose that’s great if i want to do a search for all pictures with the comment “Photo by JR Conlin <unitedHeroes.net>” since that’s the EXIF comment i set my camera up to add. Unfortunately, none of the big search engines allow me to do that sort of search across the web, where it might really be useful, so i can see who’s using my images.
To be honest, no, i really have little intention of running any of the desktop search tools because none of them address what i see as being the larger problem.
You see, no matter how organized i may be with my own systems, data will still get lost. i have two personal web sites, three others i help out with from time to time, 17 personal MySQL databases ranging from log analyzers to this blog with various levels of content, an entire disk full of source code (C/C++/Perl/Smalltalk), my work computers (a UNIX platform and a Windows client, plus a number of back-end systems that are tied to them), my Palm, my phone and a shared drive on my wife’s laptop. This, of course, doesn’t include the various floppies and burned CDs i have of archived material. All told, i calculate that i have about half a terabyte of information i’ve spread out that i would like to be able to index.
Desktop search? Ha! For me, that’s like being able to rapidly find things in one folder of my filing cabinet.
To be fair, i am fairly well organized about a good deal of the information i’ve scattered about. Considering the volume of it, i have to be.
But, honestly, all of these desktop search tools leave me completely underwhelmed.
What i want is a secure, centralized server (it has to be accessible from anywhere) that receives updates from the various devices, web sites and other locations of interest. It stores thumbnails and EXIF information along with some image analysis to determine what it is the picture might be of, and really does try to get the lyrics of the various MP3s it’s storing. It scans (or even better, has hooks to) the various databases i’ve added to it so that it can do live queries. My databases change more often than the static content, and are also more likely to be compressed and externally unindexable. And yeah, i want it to be cross-platform.
Until the day that some company actually puts a real tool like that together, i’ll pass on playing with the toys. Well, that or build something using find, strings, Perl, mysql, and an Apache front-end.
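For a rough idea of what that home-built version might look like, here is a hedged sketch. It swaps Python in for Perl and SQLite for MySQL so that it runs with nothing but the standard library. `os.walk` plays the part of `find`, and a regular expression approximates `strings` by pulling out runs of four or more printable ASCII bytes. The table layout and the names `extract_strings`, `index_tree` and `find_files` are assumptions invented for this example.

```python
import os
import re
import sqlite3

# Runs of four or more printable ASCII bytes, roughly what strings reports.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(path, limit=1_000_000):
    """Return printable ASCII runs from the first `limit` bytes of a file."""
    with open(path, "rb") as handle:
        data = handle.read(limit)
    return [run.decode("ascii") for run in PRINTABLE_RUN.findall(data)]


def index_tree(root, db):
    """Walk `root` and record each file's path, size and extracted strings."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(path TEXT PRIMARY KEY, size INTEGER, content TEXT)"
    )
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                content = "\n".join(extract_strings(path))
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable file: skip it rather than abort the walk
            db.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                (path, size, content),
            )
    db.commit()


def find_files(db, term):
    """Return paths whose extracted strings contain `term` as a substring."""
    rows = db.execute(
        "SELECT path FROM files WHERE content LIKE ? ORDER BY path",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

Something like `index_tree("/home/me", sqlite3.connect("index.db"))` followed by `find_files(db, "Photo by")` would cover the basics (both paths are placeholders). The `LIKE` match scans every row and treats `%` and `_` in the term as wildcards, so this is a toy, not a replacement for a real index.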