Working of Googlebot
Googlebot
• Googlebot is the web crawler software used by Google to
collect documents from the web and build a searchable index
for the Google Search engine. The name actually refers to two
different types of web crawlers: a desktop crawler and a
mobile crawler.
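• For illustration, these are representative user-agent strings for the
two crawler types, in the form Google documents them (the Chrome version,
shown as W.X.Y.Z, varies over time):

    Googlebot Desktop:
    Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

    Googlebot Smartphone (mobile):
    Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P)
    AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile
    Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)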
Behaviour
• A website will probably be crawled by both Googlebot Desktop
and Googlebot Mobile. The subtype of Googlebot can be
identified from the user agent string in the request. However,
both crawler types obey the same product token (Googlebot) in
robots.txt, so a developer cannot selectively target either
Googlebot Mobile or Googlebot Desktop using robots.txt.
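• A minimal robots.txt sketch (the paths are hypothetical): because both
crawler types obey the same Googlebot product token, the first group below
applies to desktop and mobile crawling alike.

    User-agent: Googlebot
    Disallow: /private/

    User-agent: *
    Disallow: /tmp/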
How it works
• Googlebot uses sitemaps and databases of links discovered
during previous crawls to determine where to go next.
Whenever the crawler finds new links on a site, it adds them
to the list of pages to visit next. If Googlebot finds changes in
the links or broken links, it will make a note of that so the
index can be updated.
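• As a rough mental model only (not Google's actual implementation), the
scheduling described above resembles a breadth-first crawl frontier seeded
from sitemaps and previously discovered links. In the Python sketch below,
fetch_links is a hypothetical helper that downloads a page and returns the
links found on it.

    from collections import deque

    def crawl(seed_urls, fetch_links, max_pages=100):
        """Breadth-first crawl: visit known URLs, queue newly found links."""
        frontier = deque(seed_urls)      # seeded from sitemaps / earlier crawls
        seen = set(seed_urls)
        notes = {}                       # URL -> outcome, used to update the index
        while frontier and len(notes) < max_pages:
            url = frontier.popleft()
            try:
                links = fetch_links(url)   # hypothetical fetch-and-parse helper
            except OSError:
                notes[url] = "broken"      # note broken links for the index
                continue
            notes[url] = "ok"
            for link in links:
                if link not in seen:       # new link -> add to pages to visit next
                    seen.add(link)
                    frontier.append(link)
        return notes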
Visiting your website
• To find out how often Googlebot visits your site and what it
does there, you can dive into your log files or open the Crawl
section of Google Search Console. Google does not share lists
of the IP addresses that the various Googlebot crawlers use,
since these addresses change often.
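• For example, a quick way to tally Googlebot hits in a combined-format
access log (the log path is hypothetical, and matching on the user agent
alone can be fooled by impostors claiming to be Googlebot):

    import re
    from collections import Counter

    GOOGLEBOT = re.compile(r"Googlebot")       # matches desktop and mobile hits
    MOBILE = re.compile(r"Mobile.*Googlebot")  # the mobile UA says "Mobile" first

    counts = Counter()
    with open("/var/log/nginx/access.log") as log:  # hypothetical log location
        for line in log:
            if GOOGLEBOT.search(line):
                counts["mobile" if MOBILE.search(line) else "desktop"] += 1

    print(counts)  # e.g. Counter({'mobile': 120, 'desktop': 35})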
Search Console
• Search Console is one of the most important tools to check
the crawlability of your site. There, you can verify how
Googlebot sees your site. You’ll also get a list of crawl errors
for you to fix. In Search Console, you can also ask
Googlebot to recrawl your site.
Optimizing Googlebot
• Getting Googlebot to crawl your site faster is a fairly
technical process that boils down to removing the technical
barriers that prevent the crawler from accessing your site
properly. If Google cannot crawl your site properly, it can
never rank it for you.
Sitemaps
• The crawler may not automatically crawl every page or
section. Dynamic content, low-ranked pages, and vast content
archives with little internal linking can benefit from an
accurately constructed sitemap. Sitemaps are also useful for
advising Google about the metadata behind content types such
as video, images, mobile, and news.
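• A minimal sitemap sketch with hypothetical URLs; the optional lastmod
field hints to Googlebot which pages have changed since its last visit.

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2020-04-01</lastmod>
      </url>
      <url>
        <loc>https://www.example.com/archive/page-1</loc>
        <lastmod>2020-03-15</lastmod>
      </url>
    </urlset>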
Crawl errors
• You can find out whether your site is experiencing any
problems from its crawl status. As Googlebot routinely crawls
the web, your site will either be crawled with no issues, or
the crawl will raise red flags, such as pages the bot expected
to find, based on the last index, but could not reach.
Checking crawl errors is the first step in Googlebot
optimization.
URL Parameters
• Depending on the amount of duplicate content caused by
dynamic URLs, you may run into indexing issues driven by URL
parameters. The URL Parameters section lets you configure how
Google crawls and indexes your site when URLs carry
parameters. By default, Googlebot decides on its own how each
page is crawled.
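• For instance, all of the following hypothetical URLs might serve the
same product page; this is the kind of duplication the URL Parameters
setting helps control:

    https://www.example.com/shoes?color=red&sessionid=123
    https://www.example.com/shoes?sessionid=456&color=red
    https://www.example.com/shoes?color=red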