You know how important SEO (Search Engine Optimization) is for getting indexed and ranked in search engines, and there are many ways to improve how your content is indexed and how well it ranks. One of them is to optimize your Robots.txt file, a plain text file (found in the root directory of any website or blog) that gives instructions to search engine bots, also called spiders. If it’s not configured well, it can cause critical issues: for example, you may have been writing for a long time without a single piece of your content being indexed, and that may be why you’re not getting traffic.
So your Robots.txt file should be well configured and optimized to let search engine bots know what to crawl and index. To do that, it should be written like the WordPress sample further down in this post (only if you’re using WordPress).
As I’ve already said, SEO is very important for websites and blogs that want traffic or business. We call it Search Engine Optimization, but we don’t actually optimize the search engines; we optimize our content for them. There are plenty of methods for doing that, and new ones appear all the time.
Why is the Robots.txt File Important for SEO :
The Robots.txt file is very important for SEO because it tells search engine bots which parts of a website or blog to crawl and index, and which parts or directories to avoid.
Simply put, it’s a file that gives instructions to search engine bots using a few command words. The common ones are:
User-agent:
Allow:
Disallow:
Sitemap:
So today we’re going to optimize a Robots.txt file (found in the root directory) for search engine bots. If your website or blog already has one, you can optimize it as shown in this step-by-step guide. If it doesn’t, you can create one by following a few simple steps (scroll down for the steps).
How to Create an SEO Optimized Robots.txt File :
If your website or blog doesn’t have a Robots.txt file, let’s create one with SEO in mind. As you know, it’s a plain text file, so it can be created with any text editor, such as Notepad. If you’re using WordPress, though, I recommend using the WordPress SEO by Yoast plugin to create and edit your Robots.txt file.
Definitions of some common commands :
User-agent: The name of the search engine bot that the rules apply to.
Allow: Tells the search engine bot: Hey! You’re allowed to crawl and index this address.
Disallow: Tells the search engine bot: Hey! You don’t have permission to crawl and index this address.
Sitemap: Tells the search engine bot: Hey! Here is my website or blog’s sitemap, please crawl it too.
A Robots.txt file contains written commands that tell search engine bots what to do, like this:
User-agent: Googlebot
Disallow: /cgi-bin/
The instructions above are addressed to Googlebot and say: Hey! Googlebot, you don’t have permission to crawl and index any part of this directory. So if you write this instruction in your Robots.txt file, Googlebot will not crawl and index this directory on Google.
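As a rough way to see what such a rule does, the sketch below feeds it to Python’s built-in urllib.robotparser module and asks whether a URL may be fetched. The example.com URLs and the Bingbot name are placeholders chosen for illustration, the directory is assumed to be /cgi-bin/, and Python’s parser only approximates how a real search engine reads the file.

```python
from urllib.robotparser import RobotFileParser

# The rule discussed above, one directive per line.
rules = """\
User-agent: Googlebot
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot is kept out of /cgi-bin/ but may crawl everything else.
blocked = parser.can_fetch("Googlebot", "https://example.com/cgi-bin/script.cgi")
allowed = parser.can_fetch("Googlebot", "https://example.com/blog/my-post/")

# A bot that is not named in the file is not restricted by this rule.
other_bot = parser.can_fetch("Bingbot", "https://example.com/cgi-bin/script.cgi")

print(blocked, allowed, other_bot)
```

Here can_fetch returns False for the blocked URL and True for the other two checks.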
So one thing is clear: when we need to stop a search engine bot from crawling and indexing certain web pages or directories, for example to avoid duplicate content, we use the Disallow: command.
User-agent means the name of the search engine bot. If you don’t want a directory to be crawled and indexed by any search engine bot, you can make the User-agent command apply to all search engine bots (web crawlers) by using an asterisk (*), as in User-agent: *.
And if you want to stop only one particular search engine bot, replace the asterisk (*) with the name of that bot.
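The difference between the asterisk and a named bot can be sketched with Python’s standard urllib.robotparser module. The /private/ directory, the example.com URL, and the can_crawl helper are made up for this illustration, and real crawlers may differ from Python’s parser in the details.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(rules: str, bot: str, url: str) -> bool:
    """Parse robots.txt text and report whether `bot` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(bot, url)


# Hypothetical rules: the same directory blocked for every bot, or for one bot.
rules_for_all = "User-agent: *\nDisallow: /private/"
rules_for_one = "User-agent: Googlebot\nDisallow: /private/"

url = "https://example.com/private/page.html"

# With the asterisk, every bot is blocked.
all_googlebot = can_crawl(rules_for_all, "Googlebot", url)
all_bingbot = can_crawl(rules_for_all, "Bingbot", url)

# With a bot name, only that bot is blocked.
one_googlebot = can_crawl(rules_for_one, "Googlebot", url)
one_bingbot = can_crawl(rules_for_one, "Bingbot", url)

print(all_googlebot, all_bingbot, one_googlebot, one_bingbot)
```

With the asterisk both bots are refused; with the bot name only Googlebot is refused.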
Now, the big question is where to find the names of search engine bots. No need to worry: you can find a list of the names of most search engine bots here.
What you should and should not do (Right Way – Wrong Way) :
To write a comment in the Robots.txt file, start it with the “#” character. Everything from the “#” to the end of the line is treated as a comment and ignored. Whitespace at the beginning and at the end of a line is ignored as well. (Content Line Credit – Google Developer)
Right way:
User-agent: Googlebot-News # Name of the Google News crawler
Allow: /indianews/ # India News directory
Wrong way:
User-agent: Googlebot-News <!-- Name of the Google News crawler -->
Allow: /indianews/ -India News Directory
Disallow: /newsgraphics/ <-Graphics Directory
Google doesn’t recommend spaces at the beginning of a line. A space just after the colon is optional, but Google recommends it to improve readability.
Right way:
User-agent: Googlebot
Allow: /blog/
Wrong way (harder to read):
User-agent:Googlebot
Allow:/blog/
Disallow: /24-02-2015
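Don’t change the order of the commands in Robots.txt: the User-agent line comes first, followed by the Allow and Disallow rules for that bot. Otherwise the file may not behave as you expect.
Right way:
User-agent: Googlebot
Allow: /
Wrong way:
Disallow: /
User-agent: Googlebot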
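To illustrate why the order matters, here is a small sketch using Python’s standard urllib.robotparser module. The /private/ directory, the example.com URL, and the is_allowed helper are invented for the example. It shows only how this particular parser reacts; other crawlers may treat a misordered file differently.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: str, bot: str, url: str) -> bool:
    """Parse robots.txt text and report whether `bot` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(bot, url)


# Same two lines, in the right and in the wrong order.
right_order = "User-agent: Googlebot\nDisallow: /private/"
wrong_order = "Disallow: /private/\nUser-agent: Googlebot"

url = "https://example.com/private/report.html"

right_result = is_allowed(right_order, "Googlebot", url)
wrong_result = is_allowed(wrong_order, "Googlebot", url)

print(right_result, wrong_result)
```

In the right order the URL is refused. In the wrong order this parser drops the Disallow line because no User-agent comes before it, so the URL is reported as allowed.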
If you want to block more than one directory from being crawled and indexed, don’t chain them together in a single path. Give each directory its own Disallow line.
Right way:
User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /strock/
Disallow: /summ/
Wrong way:
User-agent: Googlebot
Disallow: /cgi-bin/strock/summ/ # Search engine bot will be confused
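The two variants can be compared with Python’s standard urllib.robotparser module. This is only a sketch: the example.com URL and the is_allowed helper are assumptions made for the illustration, and the chained path is treated here as one single nested directory, which is how Python’s parser reads it.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: str, bot: str, url: str) -> bool:
    """Parse robots.txt text and report whether `bot` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(bot, url)


# One Disallow line per directory.
separate = """\
User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /strock/
Disallow: /summ/
"""

# All three directories chained into a single path.
chained = """\
User-agent: Googlebot
Disallow: /cgi-bin/strock/summ/
"""

url = "https://example.com/strock/page.html"

separate_result = is_allowed(separate, "Googlebot", url)
chained_result = is_allowed(chained, "Googlebot", url)

print(separate_result, chained_result)
```

With separate lines /strock/ is blocked. With the chained path only the nested directory /cgi-bin/strock/summ/ is blocked, so /strock/ stays open.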
And here is a sample WordPress Robots.txt file that any WordPress user can use (just make sure you replace www.howtocracker.com in the Sitemap lines with your own domain).
User-agent: *
Disallow: /wp-admin/
Disallow: /cgi-bin/
Disallow: /recommends/
Disallow: /comments/feed/
Disallow: /trackback/
Disallow: /index.php
Disallow: /xmlrpc.php
Disallow: /wp-content/plugins/

User-agent: NinjaBot
Allow: /

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: duggmirror
Disallow: /

User-agent: Googlebot-Mobile
Allow: /

Sitemap: http://www.howtocracker.com/post-sitemap.xml
Sitemap: http://www.howtocracker.com/page-sitemap.xml
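Before uploading, you may want to sanity-check a file like this one. The sketch below does so with Python’s standard urllib.robotparser module on a shortened copy of the sample. The tested paths (/wp-admin/options.php and /some-post/) are made up for illustration, site_maps() needs Python 3.8 or newer, and real search engines may interpret some rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# A shortened copy of the sample file, one directive per line.
sample = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /trackback/
Disallow: /xmlrpc.php

User-agent: duggmirror
Disallow: /

Sitemap: http://www.howtocracker.com/post-sitemap.xml
Sitemap: http://www.howtocracker.com/page-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(sample.splitlines())

base = "http://www.howtocracker.com"

# Ordinary bots fall under the "*" group.
admin_ok = parser.can_fetch("Googlebot", base + "/wp-admin/options.php")
post_ok = parser.can_fetch("Googlebot", base + "/some-post/")

# duggmirror has its own group that blocks the whole site.
mirror_ok = parser.can_fetch("duggmirror", base + "/some-post/")

# Sitemap lines are collected separately from the crawl rules.
sitemaps = parser.site_maps()

print(admin_ok, post_ok, mirror_ok)
print(sitemaps)
```

If a check comes back differently from what you intended, fix the file before you upload it.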
So, you’ve created or edited your Robots.txt file. Now upload it to the root directory of your website, either with FTP software or through cPanel > File Manager.
Learn more :-
Google – More Official Information by Google.
JohnChow.com – How to Kill DuggMirror
Have a Good Rank. Take Care.