Google Crawling and Indexing: How Search Works


SEO is a huge process and can be confusing for beginners. Crawling and indexing are two crucial processes carried out by search engines. If you are a web developer or site owner, understanding crawling and indexing is especially important for you. If a search engine fails to crawl your website, you will face a lot of problems, and you should not expect to get ranked in search results.

In this post I am going to talk about Crawling and Indexing.

1. Crawling

 


Crawling is the process in which Google uses software called "web crawlers" to discover the accessible pages of a website. Here, accessible means publicly available and not blocked from crawlers. Not all pages are crawlable: only pages that permit crawling will be fetched by Google, and the rest are ignored. Google's crawler, "Googlebot", follows links from page to page and brings data about the web pages back to Google's servers.

If you have submitted your sitemap to Google but your content is still not appearing in Google, it is possible that the links pointing to those pages carry a nofollow directive. You might like: Do Follow and No Follow Links in SEO.
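For reference, the nofollow directive is set with the `rel` attribute on a link; the URL below is just a placeholder. A minimal sketch:

```html
<!-- A normal link: crawlers will follow it -->
<a href="https://example.com/some-page">Followed link</a>

<!-- A nofollow link: crawlers are asked not to follow it -->
<a href="https://example.com/some-page" rel="nofollow">Nofollow link</a>
```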

You can stop crawling of any page or part of your website by using a robots.txt file.
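As a sketch, a robots.txt file placed at the root of your site might look like this (the paths are hypothetical examples):

```
User-agent: *
Disallow: /private/
Disallow: /drafts/
```

The `User-agent: *` line means the rules apply to all crawlers, and each `Disallow` line blocks crawling of URLs under the matching path.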

2. Indexing


Indexing takes place after crawling is finished. Indexing is essentially the process of adding web pages into Google Search. Note that some pages may be crawlable but not indexable: if you use a noindex meta tag, the page won't be indexed, whereas pages with an index directive (or no directive at all) will be.
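The robots meta tag goes in the `<head>` of the page. A minimal sketch of the two cases described above:

```html
<!-- Page can be indexed and its links followed (this is also the default) -->
<meta name="robots" content="index, follow">

<!-- Page will still be crawled, but kept out of search results -->
<meta name="robots" content="noindex">
```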

Lots of people ask the question: my page has a follow directive, but it is still not getting indexed. Why?

The reason can be that you are allowing your page to be crawled but not indexed. You may have a noindex directive on the page. If you run into this issue, do check the robots meta tag used on the page (index or noindex).



How to deal with the robots.txt file?

Most websites allow their pages to be crawled and indexed. You should do that too, but please don't forget to disallow sensitive areas of your site in your robots.txt file:

Never ever allow wp-admin to be crawled, because exposing it makes it more likely your website will get hacked, and please change the URL of your admin login page. This wasn't part of the post, but it is my duty to suggest what's better to my readers so that they can save their time.
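For a typical WordPress site, a robots.txt along these lines is a common starting point (the `Allow` line for admin-ajax.php is a widely used convention so front-end AJAX features keep working, and the sitemap URL is a placeholder; adjust for your own setup):

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml
```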

Liked this post? If yes, then please do share this article on Facebook and Twitter. If you have any questions, feel free to ask them; my team and I will be happy to help you.


Last updated: May 18, 2016

Authored by:

Satyansh Tiwari is the co-founder of 3nions and currently an Electrical Engineering Student. He is a blogger, writer, graphic designer and SEO enthusiast. When he is free, he loves experiencing nature and traveling.

3 comments

aamir saleh

Hi Satyansh,

It's a great article. Thank you for sharing it in an easy way. Let me know which robots.txt file is suitable for a WordPress e-commerce site, since it contains filter parameters and other non-useful URLs. Let me know of any robots.txt sample for a WordPress e-commerce site 🙂

Hey Aamir,
Glad you liked it!
I do not know much about e-commerce sites, but you can look at the robots.txt files of many popular e-commerce sites, like http://www.flipkart.com/robots.txt , Amazon.in/robots.txt, etc.

Thanks for the article, it is great information.

Leave a Reply