How Does Google Work? [Behind the Search Box!]

 Intro…  

What is a Search Engine?

A search engine is an information retrieval system designed to help find information stored on a computer system.

Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload.

The most public, visible form of a search engine is a Web search engine which searches for information on the World Wide Web.

Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day.

Search engines utilize automated software applications (referred to as robots, bots, or spiders) that travel along the Web, following links from page to page, site to site. Popular examples of search engines are Google, Yahoo!, and MSN Search.

The Google Search Engine

Google’s search engine is a powerful tool. Without search engines like Google, it would be practically impossible to find the information you need when you browse the Web.

Like all search engines, Google uses a special algorithm to generate search results. While Google shares general facts about its algorithm, the specifics are a company secret.

Like most search engines, Google uses automated programs called spiders or crawlers. Crawlers follow the instructions provided in the robots.txt file of a particular website.

Also like other search engines, Google has a large index of keywords and where those words can be found.

Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer.

What sets Google apart is how it ranks search results, which in turn determines the order Google displays results on its search engine results page (SERP).

Google uses a trademarked algorithm called PageRank, which assigns each Web page a score based on the links that point to it.
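
To make the idea of a link-based score concrete, here is a minimal sketch of the textbook PageRank iteration. It is not Google’s production ranking, whose specifics are secret; the function name, the damping factor of 0.85, the iteration count, and the three-page link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of scores.

    links maps each page to the list of pages it links to.
    All parameter values are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: A and B both link to C, and C links back to A.
scores = pagerank({"A": ["C"], "B": ["C"], "C": ["A"]})
```

In this toy graph, C ends up with the highest score because two pages link to it, and B ends up with the lowest because nothing links to it.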

Google also uses Hummingbird, a newer algorithm designed to apply meaning technology to billions of pages across the web.

The great success of Google Inc. is attributed not only to its efficient search algorithm, but also to the underlying commodity hardware and, with it, the file system. As the number of applications run by Google grew massively, Google’s goal became to build a vast storage network out of inexpensive commodity hardware.

Google created its own file system, named the Google File System, in 2003. It is one of the largest file systems in operation, and we will cover it in another article.

So, let’s dive into Google…

 GOOGLE BOT :  

 What is Googlebot ?  
  • Googlebot is the web crawler used by Google: it finds and retrieves pages on the web and hands them off to the Google indexer.
  • It’s easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality, Googlebot doesn’t traverse the web at all.
  • It functions much like a web browser, by sending a request to a web server for a web page, downloading the entire page, and then handing it off to Google’s indexer.
  • Googlebot visits billions of web pages and is constantly visiting pages all over the web.
 What is a web crawler?  
  • Web crawlers (also known as bots, robots or spiders) are a type of software designed to follow links, gather information and then send that information somewhere.
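
The follow-links-and-gather loop can be sketched in a few lines. This is only an illustration, not how Googlebot is implemented: it crawls a made-up, in-memory “web” (the TOY_WEB dictionary and its URLs are hypothetical) instead of sending real network requests.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
}


def crawl(start_url, web, max_pages=100):
    """Breadth-first crawl that visits each reachable page once."""
    seen = {start_url}
    queue = deque([start_url])
    visited_order = []
    while queue and len(visited_order) < max_pages:
        url = queue.popleft()
        visited_order.append(url)        # "gather information" about this page
        for link in web.get(url, []):    # "follow links" to pages not yet seen
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited_order


pages = crawl("a.example/", TOY_WEB)
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other, as the first two pages do here.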
 What is the Google indexer?  
  • Google indexer is a software program that examines the occurrences of pre-defined words in the web pages it receives from Googlebot.
  • Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google’s index database.
  • This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs.
  • This data structure allows rapid access to documents that contain user query terms.
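
The data structure described above is commonly called an inverted index. Below is a minimal sketch of one, assuming simple lowercase word splitting; the function names and the two sample documents are made up for illustration and say nothing about Google’s real index format.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each term to {doc_id: [word positions]}, sorted by term."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        terms = re.findall(r"[a-z0-9]+", text.lower())
        for position, term in enumerate(terms):
            index[term].setdefault(doc_id, []).append(position)
    return dict(sorted(index.items()))


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    postings = [set(index.get(term, {})) for term in terms]
    return set.intersection(*postings)


# Hypothetical documents standing in for pages handed over by the crawler.
docs = {"d1": "Google indexes the web", "d2": "The web is large"}
index = build_index(docs)
hits = search(index, "google web")
```

Answering a query only requires looking up each query term and intersecting the document lists, which is why access is fast even when the collection is large.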
 What does Googlebot do?  
  • Googlebot retrieves the content of web pages (the words, code, and resources that make up the webpage).
  • If the content it retrieves contains links to other pages or resources, those links are noted.
  • It then sends the information to Google.
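
As an illustration of the “links are noted” step, the sketch below pulls the links out of a page’s HTML using Python’s standard html.parser and urllib.parse modules. The class name, the helper function, and the sample page are assumptions; Googlebot’s actual parsing is not public.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content.
sample_html = '<p><a href="intro.html">Intro</a> <a href="/about">About</a></p>'
links = extract_links("https://example.com/docs/", sample_html)
```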
 How Googlebot “sees” a webpage  
  • Googlebot consists of many computers requesting and fetching pages much more quickly than we can with our web browser.
  • In fact, Googlebot can request thousands of different pages simultaneously. To avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it’s capable of doing.
  • If any component of a page (its words, code, or resources) is not accessible to Googlebot, it will not be sent to the Google index.
  • For Google to be able to rank your web pages optimally, Google needs the complete picture.
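
The throttling mentioned in this list (many requests overall, but deliberately spaced-out requests to any one server) can be sketched as a simple scheduler. The two-second delay, the function name, and the URLs are assumptions for illustration; the real crawl rates Googlebot uses are not stated in this article.

```python
from urllib.parse import urlparse


def schedule_requests(urls, delay_per_host=2.0):
    """Give each URL a start time so that requests to the same host
    are at least delay_per_host seconds apart."""
    next_free = {}    # host -> earliest time its next request may start
    schedule = []
    for url in urls:
        host = urlparse(url).netloc
        start = next_free.get(host, 0.0)
        schedule.append((start, url))
        next_free[host] = start + delay_per_host
    return sorted(schedule)


# Hypothetical URLs on two different servers.
plan = schedule_requests([
    "https://a.example/1",
    "https://a.example/2",
    "https://b.example/1",
    "https://a.example/3",
])
```

Requests to different hosts can start at the same moment, while the three requests to a.example are spread over four seconds.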

There are many scenarios where Googlebot might not be able to access web content. Here are a few common ones:

  • Resource blocked by robots.txt
  • Page links not readable or incorrect
  • Over-reliance on Flash or other technology that web crawlers may have issues with
  • Bad HTML or coding errors

If you have a Google account, use the “fetch and render” tool found in Google Search Console. This tool provides a live example of exactly how Google sees an individual page.

Googlebot follows the instructions it receives via the robots.txt standard, and Google also offers more advanced, Google-specific ways to control it.

 What is a robots.txt file?  
  • The robots.txt file controls how search engine spiders like Googlebot see and interact with your web pages.
  • In short, a robots.txt file tells a web crawler what to do when it visits your pages by listing the files and folders that you do not want the crawler to access.

  • For a real-world example, see the robots.txt file of “springer.com”.
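
As a rough illustration of how a crawler reads such a file, the sketch below parses a made-up robots.txt (not Springer’s actual file) with Python’s standard urllib.robotparser module. The rules, the bot name “OtherBot”, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, written only for this example.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /tmp/
Disallow: /private/
"""


def build_parser(text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, user_agent, url):
    """Ask whether the given crawler may fetch the given URL."""
    return parser.can_fetch(user_agent, url)


rules = build_parser(ROBOTS_TXT)
googlebot_private = is_allowed(rules, "Googlebot", "https://example.com/private/page.html")
otherbot_home = is_allowed(rules, "OtherBot", "https://example.com/index.html")
```

Here the Googlebot group blocks only /private/, while every other crawler falls back to the `*` group and is blocked from both /tmp/ and /private/.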
GOOGLE HUMMINGBIRD ALGORITHM :
  • At the end of September 2013, as part of its 15th-anniversary celebrations, Google announced the introduction of Hummingbird, a new algorithm that allows the search engine to process and sort its index more efficiently.
  • With this new algorithm, Google is better able to understand the meaning of a phrase and return more precise results to complex search queries.
  • The new algorithm, named Hummingbird because it is fast and precise, is the most comprehensive algorithm change Google has made since 2001, according to Amit Singhal, the head of Google’s ranking team.
  • Google Hummingbird is designed to apply this meaning technology to billions of pages from across the web, in addition to Knowledge Graph facts, which may bring back better results.
Difference between old and new search techniques

Fast and Accurate + Semantics
  • In addition to the fast and accurate technical aspects of the Hummingbird change, there has been a major move towards semantic search.
  • Google has immensely improved its understanding of search queries, even long-tail ones, and therefore of user intent.
  • The entire query, and the relations between word groups within it, can increasingly be targeted, identified and interpreted.
Contextual and Conversational Search
  • The Hummingbird improvements are particularly strong in Contextual and Conversational search, two areas that are strongly linked to fundamental semantics and the relationships between words.
  • In Contextual search, Google increasingly returns results that match the intention behind the query.
  • Results are no longer limited to the words themselves, but include an interpretation of intent for the search terms.
Contextual Search – An Example:
  • Search Term: “Richest person”
  • Google interprets the query and returns an answer in the search results as a Knowledge Graph integration.
  • As more searches are voice controlled (Conversational search), searches become more long-tailed and often involve whole sentences.
  • The search query is longer and often takes the form of a complete question.

An Example Of Conversational Search:
  • Search Term: “Who is the richest person in the world”
  • Google breaks the question down into its component parts and provides results that are nearly identical to those of the original keyword search.
  • Users of Google Chrome may have noticed a small microphone icon in the right-hand corner of Google’s search box. After clicking that microphone, the user can ask aloud the question they would otherwise have typed into the search box.
  • The question is then displayed on the search screen, along with the results.

Features of Hummingbird :

Knowledge Graph :

  • Hummingbird expands the use of the Knowledge Graph, so that Google answers more complex search queries and also improves the follow-up search process.
  • For example, if we first search “Who is Mahendra Singh Dhoni?” and then do a second search for “How Many Runs He Made?” Google will understand the context of your second query.

Comparisons :

  • The knowledge graph enables more comparisons between search objects (ex: “Dhoni vs. Kohli”, “India vs. Pakistan” etc.).

Geo-location enhancement :

  • If someone asks “What is the best place to buy an iPhone 6s?” then Google will likely return a result near the user’s current location.

Improved mobile search design and functionality :

  • Voice search and Android/iPhone synchronization are improved and will likely continue to improve quickly.

What does Hummingbird mean for SEO?

  • Does Hummingbird mean the end for SEO? Definitely not.
  • While any change to Google’s search engine protocols is routinely answered with a cry of frustration from SEOs and webmasters, Hummingbird shows no signs of significantly changing the search engine optimization landscape.
  • If you are following best SEO practices, there should be little or no adverse effects on your sites.
  • In my opinion, this search update gives us SEOs some advantages.
    How, you ask?
    Simple.
    The title of this topic was purposely written in the form of a question.
    Using this concept and the 5 W’s (who, what, when, where, why, and of course how) will get your content one step closer to rising to the top of search.
  • However, Hummingbird will also place a renewed emphasis on authoritative quality content.
  • If your SEO strategy is too heavily weighted towards keyword deployment, at the expense of creating authoritative content and quality links, your sites will continue to lose traction and rankings.
  • But Google has said that as long as you have been following its age-old rule of making original, high-quality content, there is nothing much to worry about, since Hummingbird was just meant to process information in a different way.

What about keywords with multiple meanings?

  • Google tries to show the results nearest to the intention of the query, and hence of the user, by showing multiple possible Knowledge Graph results.

Conversational and Relative Search
  • Google uses your previous search context and gives a result relative to the previous result.
  • For example, if we first search “Who is Mahendra Singh Dhoni?” and then do a second search for “How Many Runs He Made?” Google will understand the context of your second query.

 

Important things for optimizing a website for the new algorithm and earning a high Google rank for a website or blog:
  • A professional website with good content.
  • Good backlinks from related sites.
  • The content of your website should be original and must be related to the topic of the website.
Limitations:
  • The knowledge graph still does not incorporate many languages.
  • The Algorithm result is not applicable to all countries.
Old SEO :
  • How do I rank for this Query (keyword)?
New SEO :
  • How do I best answer the questions my users have?

CONCLUSION

  1. Today the amount of information available on the Web is growing rapidly.
  2. Search Engine technology had to scale according to the growth of the Web.
  3. Web searching technology has been evolving very rapidly and will continue to evolve.
  4. Google is designed to be a scalable search engine.
  5. Google employs a number of techniques to improve search quality, including PageRank calculation, a fast and accurate search algorithm, its own efficient file system, a large index, intelligent web crawlers and much more.
  6. Hummingbird has resulted in an overall decrease of diversity in search results, and this is particularly true for keywords with similar semantics.
  7. Knowledge Graph integrations may not be common, but they are becoming increasingly specific.
  8. The updates Google has made with Hummingbird address the changing nature of search as queries become more complex and consumers rely more heavily on local, navigational and voice-powered search.
  9. To take full advantage of these algorithm changes, brands need to ensure their website content answers consumers’ most common questions.
  10. If your brand is able to provide clear and easy-to-find answers to brand-related topics, it will drive traffic to your site and loyalty among consumers.
  11. In short, Google is getting better at understanding the search intent of the user and producing matched search results – quickly and accurately.
