You work hard, contact scores of professional A-list bloggers, and pitch your post for an incoming link. If they give you a backlink, you are happy. A high-PageRank, unpaid link, given with the free will of the giver, from the most relevant page or category, is a million-dollar vote for your page. It can single-handedly skyrocket your page from SERP 400 to SERP 3.
Yesterday, I had a discussion in the Digital Point Forums about backlinks and their validity. It seems that most people know little about backlink validity analysis: they know about DoFollow and NoFollow, and nothing beyond that. Here, we will see the importance of the robots.txt file in backlink analysis. Robots.txt is a simple link-invalidation secret that several professional bloggers won't share with you.
What Is Robots.txt
When a search crawler accesses a website for crawling and indexing, the first thing it looks for is the robots.txt file. If it doesn't find one, it goes ahead and indexes the site normally.
A robots.txt file is a small text file in the root directory of a domain that tells search engine crawlers which pages or sections of the website should not be crawled and indexed. Its general format is this:
User-agent: Googlebot
Disallow: /links.html
The above code simply disallows Googlebot from accessing the links page of a website (whose relative path is /links.html). This means that if you are an advertiser and you exchanged links with this person, who has disallowed the links page, your backlink from that page holds no value whatsoever. But it will definitely look like a DoFollow backlink to the untrained eye. Search bots will normally index any pages not mentioned in the robots.txt file.
User-agent: *
Disallow: /page/categories.htm
This code disallows the page for all search bots, not only Google's. The wildcard '*' specifies that every search bot must follow the rule, so none of them will index the mentioned page.
A careful, clever advertiser thinks from the search bot's point of view and first looks at the robots.txt file of the sites he plans to purchase links from. He simply won't purchase a backlink from any disallowed page.
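You can even automate this check. Here is a minimal sketch in Python 3, using only the standard library's urllib.robotparser; the example.com domain and the /links.html path are hypothetical placeholders, not any real site:

from urllib.robotparser import RobotFileParser

def may_crawl(page_url, robots_url, agent="Googlebot"):
    # Return True if `agent` is allowed to crawl page_url per robots.txt.
    parser = RobotFileParser()
    parser.set_url(robots_url)
    parser.read()  # fetch and parse the robots.txt file
    return parser.can_fetch(agent, page_url)

# Hypothetical example: check a links page before exchanging links.
if may_crawl("http://example.com/links.html", "http://example.com/robots.txt"):
    print("Googlebot may crawl the page; a link there can carry weight.")
else:
    print("The page is disallowed; a backlink from it is invalid for SEO.")

If can_fetch() returns False, the search bot will never see your link, no matter how DoFollow it looks in the HTML.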
Checking Robots.txt File of Any Website
You can easily check the robots.txt file of any website out there. It sits right in the root directory of the domain, and it is always named exactly that: robots.txt. You can access it directly through your browser. For instance, to see the robots.txt file of Microsoft.com, just type this in the address bar:
“http://www.microsoft.com/robots.txt”
Google: “www.google.com/robots.txt”
Simple, right? You will notice that these sites have disallowed a lot of internal pages from being crawled. This is why those internal pages do not show up in search results and why links from them hold no weight.
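If you would rather script this than open a browser, here is a minimal Python 3 sketch that downloads any site's robots.txt with the standard library; the Microsoft domain is just the same example as above:

from urllib.request import urlopen

def fetch_robots(domain):
    # Download and return the robots.txt text for the given domain.
    with urlopen("http://" + domain + "/robots.txt") as response:
        return response.read().decode("utf-8", errors="replace")

print(fetch_robots("www.microsoft.com"))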
Through the robots.txt file, you can even specify the sitemap of a website. If you check my robots.txt file, “http://cutewriting.blogspot.com/robots.txt”, you will see that a sitemap has been specified.
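For reference, a sitemap is declared with a single Sitemap line. A hypothetical robots.txt combining both directives might read:

User-agent: *
Disallow: /links.html
Sitemap: http://example.com/sitemap.xml

The Sitemap line points crawlers to the full list of pages you do want indexed, while the Disallow line keeps them off the links page.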
Conclusion
Before you purchase links from any website for SEO (if you purchase at all) or go for a link exchange, first look at the robots.txt file and see whether your links are going to get any weight at all. If not, that link exchange simply won't work.
By the way, neither this article nor this blog recommends purchasing backlinks for SEO. If you purchase backlinks, it should be for traffic, and the links should be NoFollow. Purchasing backlinks for SEO can easily get your site banned by Google.
In the next article of the Link Analysis series, we will see another important thing to check before exchanging links: the Robots meta tag. Subscribe and enjoy.