Thread: Robot.txt

  1. #1
    Join Date
    Oct 2011
    Posts
    116

    Robot.txt

    What is robots.txt, and what are the steps for using it?

  2. #2
    Join Date
    May 2012
    Posts
    60
    A robots.txt file tells search engine spiders how to interact with your content when crawling and indexing it.

    By default search engines are greedy. They want to index as much high quality information as they can, and they will assume that they can crawl everything unless you tell them otherwise.

    If you specify data for all bots (*) and data for a specific bot (like GoogleBot) then the specific bot commands will be followed while that engine ignores the global/default bot commands.

    If you make a global command that you want to apply to a specific bot and you have other specific rules for that bot then you need to put those global commands in the section for that bot as well, as highlighted in this article by Ann Smarty.
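
    To see this precedence in action, here is a minimal sketch using Python's standard urllib.robotparser module. The paths (/private/, /drafts/) and the bot name OtherBot are made up for illustration, and real search engines may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one global group and one Googlebot-specific group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

# Same file, but the global rule is repeated inside the Googlebot group.
ROBOTS_TXT_FIXED = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /private/
Disallow: /drafts/
"""


def load(text):
    """Parse robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


before = load(ROBOTS_TXT)
after = load(ROBOTS_TXT_FIXED)

# Googlebot matches its own group, so the global /private/ rule is not applied.
googlebot_private_before = before.can_fetch("Googlebot", "/private/page.html")
# A bot without its own group falls back to the * group.
otherbot_private_before = before.can_fetch("OtherBot", "/private/page.html")
# Once the rule is repeated in the Googlebot group, Googlebot is blocked too.
googlebot_private_after = after.can_fetch("Googlebot", "/private/page.html")

# prints: True False False
print(googlebot_private_before, otherbot_private_before, googlebot_private_after)
```

    In the first file Googlebot is allowed into /private/ even though every other bot is kept out, which is usually not what the site owner intended.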

    When you block URLs from being crawled by Google via robots.txt, Google may still show those pages as URL-only listings in its search results. A better way to keep a particular page out of the index entirely is to use a robots noindex meta tag on a per-page basis. You can tell search engines not to index a page, or not to index a page and not to follow its outbound links, by inserting one of the following tags in the HTML head of the document you do not want indexed:
    <meta name="robots" content="noindex">
    <meta name="robots" content="noindex,nofollow">
    Please note that if you block the search engines in robots.txt and also use the meta tags, they may never get to crawl the page to see the meta tags, so the URL may still appear in the search results as a URL-only listing.
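
    If you want to check which robots directives a page actually carries, something like the following rough sketch may help. It uses Python's standard html.parser; the class name, function name, and sample page are my own, not part of any search engine's tooling.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_meta_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical page using the second tag shown above.
page = '<html><head><meta name="robots" content="noindex,nofollow"></head><body></body></html>'

# prints: ['nofollow', 'noindex']
print(sorted(robots_meta_directives(page)))
```

    A crawler can only read these directives after fetching the page, which is why a robots.txt block on the same URL can keep the noindex tag from ever being seen.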

    If you do not have a robots.txt file, your server logs will record a 404 error whenever a bot tries to access it. If you want to stop getting those 404 errors but do not want to give bots any specific commands, you can upload a blank text file named robots.txt to the root of your site (e.g. seobook.com/robots.txt).
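
    As a quick sanity check that a blank file really means "no restrictions", here is a small sketch. The temporary directory stands in for your site's document root (the real path depends on your server), and AnyBot is a made-up crawler name.

```python
import tempfile
from pathlib import Path
from urllib.robotparser import RobotFileParser

# Stand-in for the site's document root; the real location depends on your server.
with tempfile.TemporaryDirectory() as web_root:
    robots_path = Path(web_root) / "robots.txt"
    robots_path.write_text("")  # an empty file: no rules at all

    parser = RobotFileParser()
    parser.parse(robots_path.read_text().splitlines())

    file_size = robots_path.stat().st_size
    allowed = parser.can_fetch("AnyBot", "/any/page.html")

# prints: 0 True
print(file_size, allowed)
```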

    Some search engines allow you to specify the address of an XML Sitemap in your robots.txt file, but if your site is well structured with a clean link structure you should not need to create an XML sitemap.
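
    If you do decide to list a sitemap, the line looks like the one in this sketch. The example.com address is a placeholder, and site_maps() assumes Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that allows everything and points to a sitemap.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of URLs, or None if there is no Sitemap line.
sitemaps = parser.site_maps()

# prints: ['https://www.example.com/sitemap.xml']
print(sitemaps)
```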

  3. #3
    Join Date
    Jan 2012
    Posts
    225
    Hi, in my opinion your URL is online and visitors can see your webpage, but Google cannot crawl it.

  4. #4
    Join Date
    Jan 2012
    Posts
    30
    When a search engine crawler comes to your site, it looks for a special file called robots.txt. That file tells the search engine spider which pages of the website should be indexed and which should be ignored.

  5. #5
    Join Date
    May 2012
    Posts
    68
    Robots.txt is a very helpful file that tells search engine spiders which URLs to crawl and index and which not to.




  6. #6
    Join Date
    Oct 2011
    Posts
    185
    robots.txt is simply a useful way to keep search engines from crawling links you don't want crawled. You can disallow any search engine from crawling your broken or dead links. This can be done with a webmaster tool.

  7. #7
    Join Date
    Mar 2012
    Location
    Calgary, Canada
    Posts
    48
    robots.txt is a text file that a webmaster puts in the root directory of a website to tell crawler bots clearly which pages should not be crawled or indexed.
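
    Because the file has to sit in the root directory, its address can be worked out from any page URL on the site. A small sketch, with a made-up function name and example URL:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# prints: https://www.example.com/robots.txt
print(robots_txt_url("https://www.example.com/blog/2012/post.html?ref=1"))
```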

  8. #8
    Join Date
    Mar 2011
    Location
    Bangalore
    Posts
    127
    Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note saying “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why, if you have really sensitive data, it is naive to rely on robots.txt to keep it from being indexed and displayed in search results.
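
    The unlocked-door point can be illustrated with a toy sketch. Nothing here touches the network: fake_fetch is a stand-in for an HTTP request, and the bot name and /secret/ path are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all bots to stay out of /secret/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /secret/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def fake_fetch(path):
    """Stand-in for an HTTP request; the server answers every path."""
    return f"contents of {path}"


def polite_fetch(path, agent="PoliteBot"):
    """Fetch only if robots.txt allows it; return None when asked to stay out."""
    if not parser.can_fetch(agent, path):
        return None
    return fake_fetch(path)


def rude_fetch(path):
    """Ignore robots.txt entirely; nothing on the server side stops this."""
    return fake_fetch(path)


# prints: None
print(polite_fetch("/secret/report.html"))
# prints: contents of /secret/report.html
print(rude_fetch("/secret/report.html"))
```

    The check lives entirely in the crawler's own code, so anything that truly must stay private needs real access control on the server.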

  9. #9
    Join Date
    Jul 2012
    Location
    USA
    Posts
    16
    Robots.txt is a text file created by webmasters to tell search engine robots how to crawl and index pages on their website.

