Search Engines and Directories

Part Seven
Dubious Tricks

Danger! Warning!!

Serious Disclaimer:
Some of these tricks will get you in serious trouble. And I do mean serious! Search engines may ban you permanently if you try some of these stunts. In addition, the IRS will audit you, your spouse will leave you, and your dog will drop dead on the spot. Don't say that I didn't warn you.

Multiple Titles

What's in a title? Well, quite a lot, if you believe certain search engines. Some of them will grant more importance to words in your title than they will to any other words on your site. However, the same search engines will only read the first 15 words from your title.

This can be a big problem if you have lots of keywords. To solve the problem, some people came up with the idea of putting multiple titles on their pages. The search engines, being only stupid robots, read the first 15 words from each title, then counted them all. However, they would only show one title on their search page, so the result still looked good.

However, search engines don't stay stupid forever. Most have now been programmed to look for this trick. If they find it, your whole site gets thrown out of the game.
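To give a feel for how little effort this check takes, here is a minimal sketch of a multiple-title detector. It uses only Python's standard html.parser module. The names TitleCounter and has_multiple_titles are made up for this illustration; nobody outside the search engines knows what their real code looks like.

```python
from html.parser import HTMLParser


class TitleCounter(HTMLParser):
    """Collects the text of every <title> element found in a page."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current title.
        if self._in_title:
            self.titles[-1] += data


def has_multiple_titles(html):
    """Return True when the page contains more than one <title> element."""
    parser = TitleCounter()
    parser.feed(html)
    parser.close()
    return len(parser.titles) > 1
```

A robot that can read the first title can count the rest just as easily, which is probably why this trick stopped working.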

Invisible Body Text

It can be really hard to get all of your keywords into the body of your page without sounding stupid. It's even worse if you use a lot of graphics on your front page. Several methods have been devised to put text on the page where search engines will read it but human visitors won't notice it.

Tiny Text

Some enterprising soul started putting a list of keywords in very small print - font size="1". The search engines couldn't tell the difference, but most humans ignored it. This worked in the early days, but most search engines have now been programmed to ignore anything that's too small for most people to read. Anyway, it looks ugly.

Text Color

The next method of hiding text was to make it the same color as the background. This had the added advantage that human visitors couldn't see it at all. However, this one was really easy for a search engine to detect.

The next variation on this one was to make the text only slightly different from the background color. For example, if your background is pure white (#ffffff), make your text a barely off-white grey (#fcfcfc). This wasn't too hard to detect either, so it will probably get you caught as well.
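Here is a rough sketch of how a robot might spot both the same-color and the nearly-same-color versions. It compares the red, green, and blue channels of two hex colors. The function names and the threshold of 16 are assumptions picked for illustration, not anything a search engine has published.

```python
def parse_hex_color(value):
    """Turn '#rrggbb' into an (r, g, b) tuple of ints from 0 to 255."""
    value = value.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a six-digit hex color")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def looks_hidden(text_color, background_color, threshold=16):
    """True when every color channel differs by less than `threshold`.

    The threshold is an arbitrary value chosen for this example.
    """
    text = parse_hex_color(text_color)
    background = parse_hex_color(background_color)
    return all(abs(t - b) < threshold for t, b in zip(text, background))
```

With these settings, looks_hidden("#fcfcfc", "#ffffff") comes back True, while ordinary black text on a white background comes back False.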

It would be better to use a background image of the correct color. You could make a pure white background graphic, then set your background color to black and your hidden text to white. This will look fine to any robot that's spying on the colors, but will be invisible to humans. So far as I know, nobody's caught on to this one yet. That doesn't mean that you should go out and do it, though. In fact, forget that I said this.

Other people have taken to disguising their text in the stylesheet. You can set the font color to the background color, or set the font size to zero. Search engines don't request stylesheets (at least not yet), so this will probably work for a while. However, there are ways of spotting this sort of thing without even looking at the stylesheet, so you're likely to get caught anyway. Google seems to prefer to use technological tricks to find spammers, and they usually succeed.

Forms Data

Most search engines will read the data that's being passed to a form. However, some form fields are deliberately hidden (such as your email address being passed to a mail form program). This gives another opportunity to hide keywords. Just include a

<input type="hidden" value="I'm going to hide this from you">

and you're all set.

This is acceptable if you need that variable to make your form work. But too many forms on a site will look suspicious, as will too many hidden tags in a single form. Best to avoid trying this trick as well.
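Counting forms and hidden fields is about as easy as it gets for a robot. The sketch below shows one way it might be done with Python's standard html.parser module. The names HiddenFieldCounter and count_hidden_fields are invented for this example, and what counts as "too many" is left to your imagination, since the search engines don't say.

```python
from html.parser import HTMLParser


class HiddenFieldCounter(HTMLParser):
    """Counts <form> elements and <input type="hidden"> fields in a page."""

    def __init__(self):
        super().__init__()
        self.form_count = 0
        self.hidden_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.form_count += 1
        elif tag == "input":
            attr_map = dict(attrs)
            # An attribute written without a value is reported as None.
            if (attr_map.get("type") or "").lower() == "hidden":
                self.hidden_count += 1


def count_hidden_fields(html):
    """Return a (forms, hidden_fields) tuple for the given HTML."""
    parser = HiddenFieldCounter()
    parser.feed(html)
    parser.close()
    return parser.form_count, parser.hidden_count
```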

Doorway Pages

Some people built special pages designed to be attractive to search engines, then linked these "doorway pages" to their regular pages. The doorway pages had very little content and lots of keywords. This trick doesn't work anymore, so don't even think of it.

Broken Code

This clever trick was based on the behaviour of the browsers in use at that time: they would refuse to display a table that had a missing </table> tag. However, the search engines were less picky, and read the whole page.

The individual who came up with this scheme was trying to get listed for the most popular search term of the day (no, not that one.) He tried everything, but his page just wouldn't make it. He started looking at the source code for the top pages the search engines were listing, and noticed that one of them used a complex table. He copied this entire table onto his page, then deleted the closing tags from the table. His page then shot up to the top of the list.

More recent versions of Internet Explorer will display this broken table, so this trick is of limited usefulness today. Besides, the person that you steal the table from may object to the copyright violation. Given today's business climate, their displeasure is likely to involve lawyers. Bad idea.

Stealth Code

In order for this trick to make sense, I first need to explain a bit of the communication between your computer and the server. When you ask your browser to open a page, there's a bit of back-and-forth between your computer and the server. In addition to the address of the page that you want to see, your browser sends some basic information about you to the server. Nothing personal; just an operating system and browser identification.

For example, my computer just identified itself to the server with

"Mozilla/4.0 (compatible; MSIE 5.0; Windows 2000) Opera 6.05 [en]"

This tells the server that I am running Windows 2000, and that my browser is Opera 6.05 (among other things). The original intent of this information was to tell the server what the requestor might need. For example, if the browser is Lynx, don't bother sending any images (Lynx is text-only.)
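As a rough illustration of what the server can do with that string, here is a small sketch that guesses the browser and decides whether images are worth sending. The function describe_client and the handful of browser names it checks are assumptions made for this example. Real servers use much longer lists.

```python
def describe_client(user_agent):
    """Guess the browser from an identification string.

    Opera is checked before MSIE because, as in the string above,
    Opera can announce itself as "compatible; MSIE" as well.
    """
    ua = user_agent.lower()
    if "opera" in ua:
        browser = "Opera"
    elif "msie" in ua:
        browser = "Internet Explorer"
    elif "lynx" in ua:
        browser = "Lynx"
    else:
        browser = "unknown"
    # Lynx is text-only, so images would be wasted on it.
    return {"browser": browser, "send_images": browser != "Lynx"}
```

Feeding it the string shown above gives Opera as the browser, even though the string mentions MSIE first. That ordering is exactly why the Opera check has to come first.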

Some of the high-end web sites use this information to send one page to Netscape-derived browsers and a different one to IE. This eliminates the compromises needed to make one page look good in all browsers. However, there's another use for this technology.

A search engine might identify itself with

"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

which tells me that Google's spider (named Googlebot) has come calling. Now, a clever programmer can detect this information any time a page is requested. After all, that's what they are doing to separate IE and Netscape browsers. It's then a simple matter to deliver a different page to the search engine. You can even have a separate page for each search engine, each page being custom tuned to that search engine.

Needless to say, the search engine folks don't like this practice. Some of them will send a stealth spider out to check on your pages. The stealth spider will identify itself as a normal computer running a standard browser. They then compare the results from the stealth and the regular spiders. If the two are different, you're out. I've heard of business sites having to buy a new domain after pulling this one; the old site was permanently banned. An expensive lesson in business ethics.
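The stealth check described above can be sketched in a few lines. This is only a guess at the general idea, not anyone's actual spider. The fetch argument is a hypothetical function that takes a URL and an identification string and returns the page text, so you can plug in whatever download code you like.

```python
def normalize(page):
    """Collapse runs of whitespace so trivial formatting changes are ignored."""
    return " ".join(page.split())


def pages_differ(fetch, url, spider_agent, browser_agent):
    """Request the same URL under two identities and compare the results.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    with the shape fetch(url, user_agent) -> page text.
    """
    spider_page = fetch(url, spider_agent)
    browser_page = fetch(url, browser_agent)
    return normalize(spider_page) != normalize(browser_page)
```

A real check would need to be more forgiving than this, because honest pages can change between two requests (rotating ads, timestamps, and so on). The point is only that the comparison itself is cheap.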

Link Farms

With the popularity of Google's PageRank system, people began looking for ways to bias it in their favor. Since Google likes links, they put up web sites that were nothing but lists of links. This was easy for Google to block out, so some of the smarter ones added some content to each of the pages. For example, I could put a list of links at the bottom of each of these pages, then sell the links to companies. Well, no. But you get the idea.

The search engines will figure this one out soon enough. That's why I recommend that you put your links in the text, in the most relevant place. Pages with only lists of links, or with long lists at the bottom, will probably be ignored.
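One plausible way to spot a page that is mostly links is to measure how much of its visible text sits inside link tags. The sketch below does that with Python's standard html.parser module. The names LinkTextMeter and link_text_ratio are invented here, and I have no idea what cutoff (if any) a real search engine would use.

```python
from html.parser import HTMLParser


class LinkTextMeter(HTMLParser):
    """Measures how much of a page's text appears inside <a> elements."""

    def __init__(self):
        super().__init__()
        self.link_chars = 0
        self.total_chars = 0
        self._link_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._link_depth > 0:
            self._link_depth -= 1

    def handle_data(self, data):
        # Count non-whitespace characters only.
        count = len("".join(data.split()))
        self.total_chars += count
        if self._link_depth:
            self.link_chars += count


def link_text_ratio(html):
    """Return the fraction of text characters that are link text (0.0 to 1.0)."""
    parser = LinkTextMeter()
    parser.feed(html)
    parser.close()
    if parser.total_chars == 0:
        return 0.0
    return parser.link_chars / parser.total_chars
```

A page that is nothing but a list of links scores 1.0, while a page with links worked into ordinary text scores much lower.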

There is some evidence that search engines will also ignore pages that have the word "links" prominently displayed. That probably means that I've just destroyed the possibility of using this page as a link farm. Drat. Anyway, if you want to have a links page, try not to use the "L" word anywhere on it. Call it "Our Friends" or something like that. Get creative.






Last Modified: 17 May 2004
Copyright © 2000-2004 James C. Keebaugh
All Rights Reserved