We are living in an age where robots and spiders are crawling all over your Web site. No, this isn't a tag line from an old 1950s horror movie; this is the way things are. Don't be frightened, though. The fact that you have robots and spiders on your Web site is a good thing. A very good thing, if you care about your own success online. How can you make the most out of the robots and spiders? It all starts with a little file called "robots.txt".
Before I get into what the robots.txt file is all about, there is something I have to cover. If you have been around the proverbial Webmaster block a few times, you have heard about search engine spiders. They are small "robots" that search engines send out across the Internet to look for content. Just about every major search engine uses them.
Now let us start with what it is. The robots.txt file is a small text file that sits in your root directory. When search engines send out spiders to roam the Internet looking for content to pick up, they read the robots.txt file first. Think of it as your way to talk directly to the search engines.
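Because the file always sits in the root directory, its address can be worked out from any page on the same site. Here is a minimal Python sketch of that idea. The function name robots_url and the example.com address are my own illustrations, not anything a search engine publishes.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, swap in the root-level robots.txt path,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page deep inside a site; the file still lives at the root.
print(robots_url("http://www.example.com/images/photos/cat.jpg"))
# prints http://www.example.com/robots.txt
```

No matter how deep the page is, a spider looks for the file in the same place, which is why a robots.txt file tucked away in a subfolder gets ignored.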
This is how your Web site ends up on a search engine, like Google. When you "submit" your Web site to a search engine, you are putting your domain on a list of Web sites for them to spider over. Now which is best? Is it better for the search engine spider to find you by itself, or for you to submit yourself to the search engine? There are arguments on both sides, so I will not get too deep into that.
So you know now that a search engine sends out spiders to pick up content on the Internet. You know you can talk to the spider by including something in the robots.txt file in your root directory. Now comes the fun stuff.
Now that you have a robots.txt file in your root directory, you can figure out what you want to tell the search engine spiders. This time, "Hey, how you doing?!" isn't going to cut it. You have to learn how to speak their language. First there is the User-agent code. The User-agent code specifies which search engine spider you wish to speak to. Each search engine spider has a name. For example, Google's search engine spider's name is "googlebot". Other search engines have other names.
Here are two good Web sites to check out if you are curious about what names certain search engines are using:
Search Engine Dictionary - Spider Names
The Web Robots Database

To use the User-agent code to address one specific search engine spider, do this:
User-agent: googlebot
This tells Google's spider that the rules that follow in your robots.txt file are meant for it.
To use the User-agent code to address all search engine spiders, do this:
User-agent: *
This tells all search engine spiders that the rules that follow in your robots.txt file are meant for them.
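If you want to see how the two kinds of User-agent lines are read, Python's standard urllib.robotparser module can act as a stand-in spider. This is only a sketch under my own assumptions: the "private" folder, the "otherbot" name, and the rules themselves are made up for illustration, and real search engines may interpret a file somewhat differently than this module does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one set of rules for googlebot, another for everyone else.
rules = """\
User-agent: googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# googlebot matches its own section, so only /private/ is off limits to it.
print(parser.can_fetch("googlebot", "/index.html"))         # False or True? -> True
print(parser.can_fetch("googlebot", "/private/notes.html"))  # False

# Any other spider falls through to the "*" section, which blocks everything.
print(parser.can_fetch("otherbot", "/index.html"))           # False
```

The first call prints True. The point to notice is that a spider named in its own section follows that section, while the "*" section is the catch-all for every spider you did not name.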
Now, instead of telling them, "It is okay for you to get content from here, here and here", it is much easier to tell the spiders where not to go. That is where the robots.txt file is most helpful, and that is where the Disallow: command comes in handy. Using it, you can tell a search engine spider not to get anything inside your "photos" folder, for example. How does it look in the robots.txt file?
Let's write this command so that all search engine spiders stay out of my "photos" folder, located in my root directory.
User-agent: *
Disallow: /photos/
That is it! Now I don't have to worry about any well-behaved search engine spider looking inside my "photos" folder and indexing what is inside. The thing to remember is to keep your paths relative to where your root directory is. What does that mean?
If your domain name is Mitchkeeler.com, then in the above example I just told the spider to stay out of my folder here: Mitchkeeler.com/photos/. If my "photos" folder were inside my "images" folder (Mitchkeeler.com/images/photos/), then the above example wouldn't have worked.
It would have had to be changed to this:
User-agent: *
Disallow: /images/photos/
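To check the two versions of the rule without waiting for a real spider, you can feed them to Python's standard urllib.robotparser module. Treat this as a rough sketch: the helper name allowed, the spider name "examplebot", and the beach.jpg file are my own inventions, and this module's matching may not be identical to what every search engine does.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules: str, path: str, agent: str = "examplebot") -> bool:
    """Report whether a spider called `agent` may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


# The two rule sets from the examples above.
top_level = "User-agent: *\nDisallow: /photos/\n"
nested = "User-agent: *\nDisallow: /images/photos/\n"

print(allowed(top_level, "/photos/beach.jpg"))         # False: blocked
print(allowed(top_level, "/images/photos/beach.jpg"))  # True: rule does not match
print(allowed(nested, "/images/photos/beach.jpg"))     # False: blocked
```

The second line is the one to watch. The path in a Disallow: line is compared from the root directory down, so "/photos/" does nothing for a folder that really lives at "/images/photos/".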
For an example of what a robots.txt file looks like in action, let's take a trip to the White House! Here is the White House Web site's robots.txt file:
Whitehouse.gov Robots.txt.
Now you are ready to open your favorite text editor and create a robots.txt file for yourself. If you still have any questions, feel free to shoot me an e-mail or check out these handy links:
Search Engine World - Robots.txt Exclusion Standard Information.
Robotstxt.org.
