What Is Googlebot And Why Should You Care?
Filed under: Other Search Engines, SEO
This guest post comes from Jon of LocalSEO.org, who explains what Googlebot is and how it works to find and rank your website.
Googlebot is Google’s web-crawling robot: it finds pages on the web and hands them off to Google’s algorithm for processing. Strictly speaking, though, Googlebot doesn’t “crawl” the web at all. It works like a web browser: it sends a request to a server for a web page, downloads the whole page, and then moves on to the next one.
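To make the “works like a web browser” point concrete, here is a minimal sketch of that request-and-download step in Python. The URL and the bot name are invented for illustration; this is not Googlebot’s actual user-agent string or code.

```python
import urllib.request

# A crawler "fetches" a page the same way a browser does: it sends an
# HTTP GET request and downloads the response body. The URL and the
# User-Agent value below are illustrative stand-ins, not Googlebot's.
url = "https://example.com/"
request = urllib.request.Request(
    url,
    headers={"User-Agent": "ExampleBot/1.0 (+https://example.com/bot)"},
)

# Sending the request would download the whole page, exactly as described:
# with urllib.request.urlopen(request) as response:
#     html = response.read().decode("utf-8", errors="replace")

# urllib stores header names in capitalized form, hence "User-agent" here.
print(request.get_header("User-agent"))
```

Once the page body is downloaded, the crawler moves on to the next URL in its queue rather than rendering anything on screen.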
Googlebot consists of numerous computers requesting and fetching pages far faster than you can with your web browser; in fact, it can request thousands of different pages simultaneously. To avoid overwhelming web servers or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it is capable of.
How Does It Work?
When Googlebot fetches a page, it extracts all the links appearing on that page and adds them to its queue of pages to crawl. By harvesting links from every page it encounters, Googlebot can rapidly build a list of links covering broad reaches of the web. This technique, known as deep crawling, also allows Googlebot to probe deep within individual sites. Because of their massive scale, deep crawls can reach more or less every page online. And because the web is enormous, this process takes time, so some pages may be crawled only once a month.
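The link-harvesting step above can be sketched in a few lines using Python’s standard-library HTML parser. The page content is a made-up example standing in for a downloaded HTML body.

```python
from html.parser import HTMLParser

class LinkHarvester(HTMLParser):
    """Collects every href found in anchor tags, as a crawler would."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A fetched page (hypothetical content, standing in for a real download).
page = ('<html><body>'
        '<a href="/about">About</a> '
        '<a href="https://example.com/blog">Blog</a>'
        '</body></html>')

harvester = LinkHarvester()
harvester.feed(page)

# Every harvested link joins the crawl queue, to be fetched in turn.
crawl_queue = list(harvester.links)
print(crawl_queue)  # ['/about', 'https://example.com/blog']
```

Repeating this fetch-then-harvest loop over every queued URL is what lets a deep crawl fan out across a whole site, and eventually across the web.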
Googlebot was designed to be distributed across several thousand machines so that it can improve performance and scale as the web grows. In addition, to cut down on bandwidth usage, Google runs many crawlers on machines located near the sites they’re indexing. As a result, your logs may show visits from quite a few different machines at google.com, all with the user-agent Googlebot. The goal is to crawl as many pages from your site as possible on each visit without overwhelming your server’s bandwidth.
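You can spot those visits yourself by checking the user-agent field in your access logs. Here is a minimal sketch that assumes the common “combined” log format; the log line itself is invented for illustration. Note that the user-agent header can be forged, so a production check should also verify the visitor’s IP (Google recommends a reverse-DNS lookup for this).

```python
# An invented access-log line in combined log format (assumption: your
# server logs in this format; adjust the parsing if yours differs).
log_line = (
    '66.249.66.1 - - [22/Sep/2011:10:00:00 +0000] '
    '"GET /index.html HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

def claims_googlebot(line: str) -> bool:
    """True if the logged user-agent string mentions Googlebot.

    The user-agent is the last quoted field in combined log format,
    so we split on the final pair of double quotes to isolate it.
    """
    user_agent = line.rsplit('"', 2)[-2]
    return "Googlebot" in user_agent

print(claims_googlebot(log_line))  # True
```

Seeing many such lines from different IP addresses is exactly the distributed-crawler behavior described above, not a sign of anything wrong.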
Even though its purpose is simple, Googlebot must be programmed to handle several challenges. First, since it sends out simultaneous requests for thousands of pages, the queue of “visit soon” URLs must be constantly examined and compared with URLs already in Google’s index, and duplicates in the queue must be purged so that Googlebot doesn’t fetch the same page twice. Second, Googlebot must decide how frequently to revisit a page: on one hand, it’s a waste of resources to re-index an unchanged page; on the other, Google wants to re-index changed pages to deliver up-to-date results.
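The duplicate-purging idea is easy to picture with a small sketch: keep a set of URLs that are already indexed or already queued, and only enqueue URLs not in that set. All URLs below are illustrative.

```python
from collections import deque

# URLs already in the (hypothetical) index.
indexed = {"https://example.com/", "https://example.com/about"}

# Newly harvested links, including duplicates of indexed and queued pages.
discovered = [
    "https://example.com/",         # duplicate of an indexed page
    "https://example.com/blog",     # new
    "https://example.com/blog",     # duplicate within this batch
    "https://example.com/contact",  # new
]

queue = deque()
seen = set(indexed)  # everything already indexed counts as seen
for url in discovered:
    if url not in seen:
        seen.add(url)      # mark as seen so later duplicates are skipped
        queue.append(url)  # only genuinely new pages join the queue

print(list(queue))  # ['https://example.com/blog', 'https://example.com/contact']
```

At Google’s scale the “seen” check is far more elaborate than a Python set, but the principle is the same: no URL should be fetched twice for the same crawl.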
Googlebot shouldn’t access a site more than once every few seconds, although due to network delays the rate may appear slightly higher over short periods. In general, Googlebot should download only one copy of each page at a time. If you see Googlebot downloading a page multiple times, it’s probably because the crawler was stopped and restarted.
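That “once every few seconds” politeness rule amounts to enforcing a minimum delay between requests to the same host. Here is a minimal sketch; the 0.1-second delay is an illustrative stand-in for the real “few seconds”, and the fetch itself is stubbed out.

```python
import time

delay = 0.1        # minimum seconds between requests to one host (illustrative)
last_fetch = 0.0   # monotonic timestamp of the previous request

def polite_fetch(url: str) -> None:
    """Pretend-fetch a URL, pausing first if we hit the host too recently."""
    global last_fetch
    wait = delay - (time.monotonic() - last_fetch)
    if wait > 0:
        time.sleep(wait)  # enforce the gap between consecutive requests
    last_fetch = time.monotonic()
    # ... the actual HTTP request would go here ...

start = time.monotonic()
for url in ["https://example.com/a", "https://example.com/b"]:
    polite_fetch(url)
elapsed = time.monotonic() - start

# Two back-to-back requests must span at least one enforced pause.
print(elapsed >= delay)  # True
```

A real crawler tracks one such timestamp per host, so slowing down for one busy server never delays crawls of everyone else.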
Versions of Googlebot:
Googlebot has two versions: Deepbot and Freshbot.
Freshbot crawls the web looking for fresh content. It visits sites that change frequently, and how often it comes back depends on how often the content changes.
Deepbot tries to follow every link on your website and download as many of its pages as possible. This process is completed about once a month, though it may take longer if your site is stagnant and rarely updated.
Now that you have a solid basic understanding of what Googlebot is and how it works, it’s important to make the most of it. For example, knowing that Freshbot exists, it would be a very good idea to update your site frequently so that Google sends Freshbot your way. All things being equal, a site that is updated often will rank higher than one that is never touched by its owner. So start updating your site with high-quality fresh content, and Google will reward you.
Jon is a local SEO expert at localseo.org. He works with small and medium-sized business owners to provide them with high rankings and quality traffic via Google Places and organic SEO.
Originally posted on Thursday, September 22, 2011.