cdave ([personal profile] cdave) wrote 2008-07-29 09:27 am

Google killer this ain't (yet)

I don't remember the first search engine I used (probably Yahoo or Lycos), but I remember the first I switched to: Hotbot. It didn't proclaim that it reached more of the web than the others, or that it was faster, but it let you use keywords like NOT or AND in your queries, so you could filter the results and get more relevant ones.

This week there's a new kid on the search engine block:

Rather than rely on superficial popularity metrics, Cuil searches for and ranks pages based on their content and relevance.


This is such a '90s fallacy. No-one cares if you have the deepest search there is. Internet search engines are all about relevance: returning the best first page possible.

I've just tried out a not very scientific search on my name, and Cool Cuil fails on several counts.

Not using popularity tests means that the first 5 pages of results consist of dozens of pages for an artist selling prints in many online shops. Yes, these pages use his name more than I do, but ranking on that just leads to spammers creating pages full of keywords. This isn't helped by the fact that there doesn't seem to be a way to exclude words. Otherwise it would be easy to filter out results with "poster" or "print".
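To show the kind of filtering I mean, here's a minimal sketch in Python. It is not how Cuil, Hotbot, or any real engine works; the function names and the sample result titles are made up for illustration, and it only does naive whole-word matching on whitespace-separated text.

```python
def matches(text, required=(), excluded=()):
    """True if text has every required word and none of the excluded ones."""
    # Naive tokenisation: lowercase and split on whitespace (an assumption,
    # real engines do far more than this).
    words = set(text.lower().split())
    has_required = all(w.lower() in words for w in required)
    has_excluded = any(w.lower() in words for w in excluded)
    return has_required and not has_excluded


def filter_results(results, required=(), excluded=()):
    """Keep only the results that pass the AND / NOT test above."""
    return [r for r in results if matches(r, required, excluded)]


# Hypothetical result titles, invented for this example.
results = [
    "Jane Doe poster for sale",
    "Jane Doe print shop",
    "Jane Doe personal blog",
]

# Roughly: jane AND doe NOT poster NOT print
print(filter_results(results, required=["jane", "doe"],
                     excluded=["poster", "print"]))
```

With the shop pages excluded, only the blog entry is left, which is all I wanted from the first page of results.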

I had to scroll through to page 6 to find anything that wasn't a shop page. But by then the spammers had started to creep into the results.

The policy of caching all websites seems to have had an interesting effect. Google never managed to cache all of my old Blogger site, so I had a look for that. Cuil doesn't seem to have it at all. But they do have a whole bunch of spammers who have copied the text from my site. Neat, I didn't know I was so popular.

I thought I'd try and see how many there were, filtering out the other sites that may accidentally match my query. So I searched on the page title (without any punctuation), which is also the first words on the page, and was included in the results returned ... nothing.
What?
It's right there on the page. 90% of the results Cuil already showed me had that exact text. How can they not find any matches?
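For what it's worth, matching a title with the punctuation stripped isn't hard. Here's a rough Python sketch of the idea; the function names and sample strings are hypothetical, and I have no idea what Cuil actually does internally.

```python
import string


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def contains_phrase(page_text, phrase):
    """True if the phrase appears as whole words in the page text."""
    # Pad both sides with spaces so "art" does not match inside "cart".
    return f" {normalize(phrase)} " in f" {normalize(page_text)} "


# Hypothetical page text; the query is the title without its punctuation.
page = "Hello, World! This is the first line of my page."
print(contains_phrase(page, "hello world"))
print(contains_phrase(page, "goodbye world"))
```

The first lookup finds the title and the second doesn't, which is the behaviour I was expecting from a search on the page title.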

It's not the size of your cache. It's what you do with it that counts.

[personal profile] andrewducker 2008-07-29 09:07 am (UTC)
Same here. Search for "andrew ducker" on Google and you'll find my journal instantly. Do it on Cuil and the results suck.

[identity profile] licenced.livejournal.com 2008-07-29 10:30 am (UTC)
Yeah, it ain't great.

Aside from occasionally showing porn instead of a relevant image, the search results are poor at best.

Often you'll just get a ton of pages linking to something that mentions the subject you're interested in, rather than any links to what you actually wanted. Try 'Cerillion': if I type in a company name, I usually want their homepage as the first link, as I'm too lazy to type it directly in case I make the .com/.co.uk/.org/.net mix-up first time.

Same goes for my name - pages 1 and 2 contain not much of relevance, then from page 3 a load of stuff that links to my blog, rather than an actual link to my blog.