Many websites use robots.txt to block Google from indexing their pages, expecting that this also keeps the pages out of the search results. In fact, robots.txt does not do the latter, even though it does prevent your site from being indexed. Before explaining why Google behaves this way, let's look at the basic terms involved:
Indexed / Indexing – The process of downloading the content of a site or page to the search engine's servers, thereby adding it to its “index”.
Ranking / Listing – Showing a site in the search engine results pages (SERPs).
So, going from indexing to listing: a site does not have to be indexed to get listed. If a link on any page or domain points to your page, the search engine knows that the URL exists. Even if you block that page in robots.txt, the URL can still be listed in the search results on the strength of those links, although its content is never downloaded. Here is Matt Cutts' explanation of why a page that is disallowed in robots.txt may still appear in Google’s search results.
So if you want to reliably keep pages out of the search results, you need to let them get indexed, which means they must not be disallowed in robots.txt; otherwise the search engine never reads the instruction. When search engines index those pages, you can tell them not to list them. The tag below does that for you.
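To make the distinction concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URLs are made up for illustration. It only shows what a Disallow rule answers: whether a crawler may download a URL. It says nothing about whether that URL may be listed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt allows user_agent to download url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The rule blocks downloading anything under /private/ ...
blocked = may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html")
# ... and leaves the rest of the site open to the crawler.
allowed = may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/public/page.html")
print(blocked, allowed)
```

A crawler that respects the rule never downloads the blocked page, so it also never sees any instruction placed inside that page.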
<meta name="robots" content="noindex,nofollow"/>
You need to copy it into the <head> of every page that you don’t want search engines to list. In WordPress, there is a Robots Meta option on the Edit Post page in the right-hand column (under the categories). To keep a post out of the search engines, just select noindex, nofollow.
If you have something to ask or share, do add your comment.
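If you want to check that the tag really ended up in a page, a small script can read it back. This is only a sketch using Python's standard `html.parser`; the class and function names are made up for this example, and it looks only at the generic robots meta tag, not at crawler-specific variants.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


page = '<html><head><meta name="robots" content="noindex,nofollow"/></head></html>'
print("noindex" in robots_directives(page))
```

Run it against the saved HTML of a page; if "noindex" is missing from the result, the tag has not been added to that page.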








{ 17 comments… read them below or add one }
This will come in handy, thanks for sharing Shubham
But why should we restrict the site from indexing? Of course, personal blogs don’t need indexing. Anyway, thanks for sharing.
Dude, it’s pretty simple: no one wants their author page or maybe some personal posts to be indexed, as it will dilute your real content, and Matt has made clear in the video that you should not link to your unwanted pages. Hope you now get the logic behind this.
I won’t post ‘personal’ stuff on my blog in the first place. So I don’t care about the indexing issue.
Love Matt Cutts’ explanation. Pretty clear and helpful.
I am glad you liked it!
That’s how it works.
Cool, I have used noindex and nofollow tags during my Blogspot to WordPress redirection.
I used robots.txt to block pages and posts easily.
Nice post, bro. Earlier I did this from the WordPress dashboard (which actually changes the robots.txt) when I was moving to my new blog!
Nice and informative post, Shubham. Keep up the spirit.
Thanks, Nitesh, for reading my blog regularly!
Excellent info, didn’t know that. Thanks.
Should we set noindex on daily, monthly and yearly archives?
I am making a landing page specifically for an advertisement that will be linked to from one external site only (then possibly others if someone pastes the link somewhere). However, I do not want that specific landing page to be found through search engines. Does this make sense? If so, is it possible to do with this code or something else? Does the nofollow mean that if someone pastes the link, or the page I am advertising on gets crawled, the landing page will not get indexed? Thank you.
Yes, you can use the noindex value in the robots meta tag to keep Google and other search engines from listing the page.
Hi,
I have a problem in using the meta tag approach to prevent two of my urls from being indexed.
I had earlier used Disallow in robots.txt for the same purpose but it didn’t work for me.
The urls I want to prevent from indexing are of the type:
http://www.kjhj.co.uk/uk/home/care.aspx?tab=2&view=1&….
and
http://www.kjhj.co.uk/uk/home/care.aspx?tab=2&view=2&….
Clearly, using the meta tag approach as it stands would prevent the whole page, with all its tabs and views, from being indexed, whereas I want only the above two URLs to be excluded. For your information, I have several tabs and their respective ‘view’ query string values.
Can someone please suggest a solution?
Kind Regards
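
One possible approach to the question above is to emit the noindex tag only when the requested URL carries the tab/view combinations that should stay out of the search results. The sketch below shows just the decision logic in Python; the site in question runs on .aspx, so it would need to be ported. The function name and the HIDDEN_VIEWS set are made up for this example, and it assumes the extra query parameters elided in the URLs above do not affect the decision.

```python
from urllib.parse import parse_qs, urlsplit

NOINDEX_TAG = '<meta name="robots" content="noindex,nofollow"/>'

# (tab, view) pairs that should not be listed. Assumption: taken from the
# two URLs in the comment; any other query parameters are ignored here.
HIDDEN_VIEWS = {("2", "1"), ("2", "2")}


def robots_meta_for(url: str) -> str:
    """Return the noindex tag for hidden views, or an empty string."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    tab = query.get("tab", [""])[0]
    view = query.get("view", [""])[0]
    if parts.path.endswith("/care.aspx") and (tab, view) in HIDDEN_VIEWS:
        return NOINDEX_TAG
    return ""


print(robots_meta_for("http://www.kjhj.co.uk/uk/home/care.aspx?tab=2&view=1"))
```

The page template would write the returned string into its <head>, so every other tab and view stays indexable. The two URLs must also not be disallowed in robots.txt, or the tag will never be read.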