Search Conversation from the Future

Dick: Did you run that 37Signals SearchSniffr tool against the site? What does that thing do anyway?
Jane: It takes terms from our logs and related sites’ logs and uses them in search queries, and then generates a report. I have it here somewhere… here it is. The Sniffr Sensitivity is 87, which is too high. We’re still getting too many empty results and not enough best bets.
Dick: So what do we do?
Jane: We use the Synonym Makr and Zipfr to fine-tune things until we get that Sensitivity score down below 60.
Dick: Nice.
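
None of these tools exist yet, of course, but the replay idea Jane describes is easy to sketch: feed logged terms back through the search and count what comes back empty. Everything named below (replay, toy_search, INDEX, BEST_BETS) is made up for illustration, and the toy index stands in for a real engine. The conversation doesn't say how the Sensitivity score is calculated, so this doesn't attempt it; it only reports the empty-result and best-bet rates.

```python
def replay(terms, search, best_bets):
    """Run each logged term through `search` and tally the outcomes."""
    empty = [t for t in terms if not search(t)]
    with_best_bet = [t for t in terms if t in best_bets]
    return {
        "queries": len(terms),
        "empty_rate": len(empty) / len(terms),
        "best_bet_rate": len(with_best_bet) / len(terms),
        "empty_terms": empty,
    }


# Hypothetical toy index standing in for a real engine: term -> matching pages.
INDEX = {"pricing": ["/pricing"], "contact": ["/about", "/contact"]}
# Hypothetical hand-picked best bets: term -> the page we want shown first.
BEST_BETS = {"pricing": "/pricing"}


def toy_search(term):
    """Exact-match lookup; a real engine would rank by relevancy."""
    return INDEX.get(term.lower(), [])


# Terms as they might be pulled from a search log (invented sample).
logged_terms = ["pricing", "contact", "careers", "jobs"]
report = replay(logged_terms, toy_search, BEST_BETS)
print(report)
```

The empty_terms list is the useful part: those are the candidates for synonyms or new best bets.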

Yahoo and Google have made search the dominant finding paradigm out there, but once you arrive at a particular website the search is usually much less useful. It’s time we had better tools to make search great everywhere. Lou and Richard are addressing the ideas behind this and opening up a rich area for innovation.

Google word clustering and the UI plunge

John Battelle reports on Google’s recent demo of word clustering algorithms, which could force them to take the UI plunge:

What do I mean by that? Well, of all the major engines, only Google has strictly maintained what might be called the C prompt interface to search: put in yer command, get out yer list of results (Google Local is a departure, but it’s still in beta). Yahoo, Ask, A9 and others have begun to twiddle in pretty significant ways with evolved interfaces which – by employing your search history, your personal data, clustering, and other tricks – deliver more filtered and intentional results (though it is still arguable if they are more relevant).

Link courtesy of Alex.

Search method seeds

Over on the Asilomar email list I started exploring a method of adding search to a site that has grown to need it. The point is to get a pretty good idea of what should be done to arrive at good results in a logical way, rather than just install a search engine and improve through trial and error. With the help of Jonothan and Iwan, here are the seeds of a method I’ve collected so far:

Let’s assume we’re working with conventional search engine technology, e.g. mostly relevancy-based.

We have to begin either at the point the content is going into an index or at the user’s goals, and because I’m a user-centered boy I’ll start with the users. We can learn about their goals: do they want precision or recall, what are their most popular search terms, how many queries do they submit in a session, do they repeat queries over multiple sessions, how do the queries change over time, and so on.
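
If the site already has some kind of search log, a few of these questions can be answered with a short script. Here is a minimal sketch, assuming the log can be reduced to (session id, query) pairs in time order; that format, the sample data, and the function names are my assumptions, not anything a particular engine produces.

```python
from collections import Counter, defaultdict

# Hypothetical log: (session_id, query) pairs in time order.
log = [
    ("s1", "metal fabric"),
    ("s1", "Metal Fabric "),
    ("s1", "woven steel"),
    ("s2", "metal fabric"),
    ("s3", "pricing"),
]


def normalize(query):
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(query.lower().split())


def summarize(log):
    """Report popular terms, queries per session, and in-session repeats."""
    term_counts = Counter(normalize(q) for _, q in log)
    per_session = defaultdict(list)
    for session_id, query in log:
        per_session[session_id].append(normalize(query))
    total_queries = sum(len(qs) for qs in per_session.values())
    # Sessions where the same normalized query was submitted more than once.
    repeats = sum(1 for qs in per_session.values() if len(set(qs)) < len(qs))
    return {
        "top_terms": term_counts.most_common(3),
        "queries_per_session": total_queries / len(per_session),
        "sessions_with_repeats": repeats,
    }


report = summarize(log)
print(report)
```

A repeated query inside one session is often a hint that the first result set didn't help, which is worth knowing before designing the strawman UI.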

Using the user goals we can construct a strawman user interface.

Next we look at the content. We can ask what format is it in, how much is there, how will the volume change over time, for dynamic content will the search index rebuild anew or cumulatively grow, how clean is it (ROT), how often does it change, and so on.

With the strawman UI and a rough idea of what the index looks like we can simulate some search tasks and the result sets. We might then consider how metadata and tweaks to the ranking algorithm could help.

At some point we have to install the search engine, index the content, and try some queries. Then we might use a systematic approach to tweaking results:

Too many results? Try cleaning or otherwise reducing the content, changing the weighting, changing the ranking algorithm, or using a more restrictive search form UI (e.g. more fields that must be selected).

Too few results? Try adding synonyms or a less restrictive search form UI. This might also be a sign that you don’t in fact need a search engine.

Is accuracy bad? Change the algorithm, change the weighting, add metadata, use best bets, improve the search form UI.

Users can’t fulfill goals even if results are good? Try improving the results UI.
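
The checklist above can be written down as a simple lookup, which makes it easier to apply the same way to every test query. This is only a sketch: the remedies are copied from the list, but the thresholds (too_many, too_few, min_relevant) and the idea of counting relevant hits in the top ten are placeholders I made up, and each site would need to pick its own.

```python
# Symptom -> things to try, transcribed from the checklist.
REMEDIES = {
    "too_many_results": [
        "clean or reduce content",
        "change weighting",
        "change the ranking algorithm",
        "use a more restrictive search form UI",
    ],
    "too_few_results": [
        "add synonyms",
        "use a less restrictive search form UI",
        "ask whether a search engine is needed at all",
    ],
    "poor_accuracy": [
        "change the algorithm",
        "change the weighting",
        "add metadata",
        "use best bets",
        "improve the search form UI",
    ],
    "goal_not_met": ["improve the results UI"],
}


def diagnose(result_count, relevant_in_top10, goal_met,
             too_many=500, too_few=3, min_relevant=5):
    """Map one test query's outcome to symptoms; thresholds are assumed."""
    symptoms = []
    if result_count > too_many:
        symptoms.append("too_many_results")
    if result_count < too_few:
        symptoms.append("too_few_results")
    # Accuracy only means something when there are results to judge.
    if result_count >= too_few and relevant_in_top10 < min_relevant:
        symptoms.append("poor_accuracy")
    # Results look fine on paper but the user still failed: blame the UI.
    if not symptoms and not goal_met:
        symptoms.append("goal_not_met")
    return {s: REMEDIES[s] for s in symptoms}


print(diagnose(result_count=1200, relevant_in_top10=2, goal_met=False))
```

Running it over a batch of test queries and tallying the symptoms gives a rough idea of which remedy to try first.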

I’m sure I’m missing stuff, but it’s a start.

Iconic Search

James showed me this gorgeous visual search interface he and Designframe created for a company that makes ‘metal fabrics’ – sheets of flexible, woven metal designs. Aimed at architects, the results page is key too. First off, the color photos pop nicely on a black background. Second, instead of prominently showing the product that matches a query, it shows off a structure built with it, corresponding with the goals of the architect, not the goal of the manufacturer; the product is off to the side in a smaller way. It’s all about lusting after beautiful stuff, which architects like to do. It probably helps when the products are all shiny metal objects.


Hapax’s FindEngine is a search engine system somewhat like Autonomy, but better according to some accounts, due to its unique use of computational linguistic analysis. They have an interesting way of displaying results/answers:

Rather than delivering results in the form of links to documents that you then have to read to verify, FindEngine™ delivers answers in response to queries. The answers are returned in the form of sentences extracted from their original source…

There’s a feel-good profile of the founder in Brainheart magazine.


In the left column you’ll notice I reinstated the LombardiSearch interface for searching the web. The point is to more easily choose the right search engine for a query based on the engine’s particular search method. In this case I’m using Teoma, Google, and AskJeeves.

Thanks again to Mathew for the Javascript assist.

Browse and Search To Become Old-Fashioned Distinctions?

Gene Smith’s comments on browse vs. search make me think the difference between search and browse is melting away. It’s a matter of using the right metadata to describe the content, so that even search is basically an automated browse…bouncing from topic to topic to extract the right information.

As a scenario, it’ll be like when they do research by talking to the computer on Star Trek, circa Next Generation.

Cheerleaders and Search Engines

Looking at my referers I see people often find their way here by searching Google for ‘noise’, where I’m currently result #9. While this is like warm massage oil on my ego, it confirms what I used to rant about (see March 12): that popular doesn’t necessarily equal better (or more relevant). Does a person searching for ‘noise’ really want my content? I doubt it.

The same search on Teoma actually brings up a bunch of pages about noise: noise pollution, the Institute of Noise Control, noise and hearing loss…those are quality results. And while this is an isolated example, the difference is striking. For pure relevancy, I’m starting to look into Teoma.

Bridget’s Google Hacking

Bridget is experimenting with how to improve Google’s ranking of a particular site. A little bombing and tweaking of the title tag did wonders.

Historical note: years ago I submitted this blog to Bridget’s portal and she was nice enough to note me in her blog, and the link popularity spread from there. So perhaps I can help return the favor by helping her cause: santa cruz real estate :)