Google Search is a fully automated search engine that uses software known as web crawlers to regularly explore the web and find pages to add to its index. In fact, most pages that appear in search results were not submitted manually for indexing; they were discovered and added automatically as the crawlers explored the web. This document explains in detail how Google Search works in the context of your site. Understanding this process will help you fix crawl errors, get your pages into the index, and optimize how your site appears in Google Search.
Looking for something less technical? Check out the How Search Works site, which explains Search from the user's point of view.
Before diving into the details of how Search works, it is important to note that Google does not accept payment to crawl a site more frequently or to rank it higher in search results. If anyone claims otherwise, they are wrong.
Google does not guarantee that your site will be crawled, indexed, or shown in search results, even if its pages follow the Google Search Essentials.
Google Search works in three stages, and not all pages make it through each of them:
The first stage is discovering pages on the web. Because there is no central registry of all web pages, Google constantly looks for new and updated pages to add to its list of known pages. This process is called "URL discovery." Some pages are already known because Google has visited them before. Others are discovered when Google follows a link from a known page to a new one, for example, when a category page on a site links to a new blog post. You can also submit a list of pages (a sitemap) for Google to discover.
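A sitemap is simply an XML file listing the URLs you want discovered. A minimal sketch (the URLs here are hypothetical placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/blog/new-article</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
</urlset>
```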
Once Google discovers a page's URL, it may visit the page to find out what it contains. A large fleet of computers is used to crawl billions of pages on the web. The program that performs this task is called Googlebot (also known as a crawler, robot, bot, or spider). Googlebot uses an algorithm to determine which sites to crawl, how often, and how many pages to fetch from each site. Googlebot is also designed not to overload site servers by crawling them too fast; this mechanism adapts to server responses (for example, HTTP 500 errors tell it to slow down).
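The adapt-to-server-responses idea can be sketched as a simple backoff policy. This is an illustrative assumption, not Googlebot's actual algorithm: double the delay between requests on 5xx errors, and relax it again when the server responds normally.

```python
def next_crawl_delay(current_delay, status_code, min_delay=1.0, max_delay=600.0):
    """Sketch of a polite crawl-rate policy (an assumption, not Googlebot's
    real logic): back off exponentially on server errors, recover otherwise."""
    if status_code >= 500:
        # Server is struggling: double the delay, capped at max_delay.
        return min(current_delay * 2, max_delay)
    # Healthy response: halve the delay, but never below min_delay.
    return max(current_delay / 2, min_delay)

delay = 2.0
delay = next_crawl_delay(delay, 503)  # server error: back off to 4.0
delay = next_crawl_delay(delay, 200)  # healthy response: relax to 2.0
```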
However, Googlebot does not crawl every page it discovers. Some pages may be blocked from crawling by the site's settings, for example through a rule in the robots.txt file, while others may require a login to access.
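You can check how a robots.txt rule affects a given URL with Python's standard-library parser. The robots.txt contents and URLs below are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: Googlebot may crawl
# everything except URLs under /private/.
robots_txt = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))  # True
print(parser.can_fetch("Googlebot", "https://example.com/private/x"))  # False
```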
During crawling, Google also renders the page and executes its JavaScript using a recent version of the Chrome browser, which helps it see the content the page actually displays. This matters because many sites rely on JavaScript to produce their content, and without rendering Google might miss important data.
After a page is crawled, Google tries to understand what the page is about. This stage is called indexing, and it includes processing and analyzing the text content as well as key content tags and attributes, such as <title> elements and alt attributes.
During indexing, Google determines whether the page is a duplicate of another page on the web. If several pages are similar, a canonical version is selected to be shown in search results. This matters because pages with the same content can be served in different contexts (for example, on mobile devices or as alternate versions of a page). The canonical page is the one Google has chosen as the most representative for search.
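Site owners can hint at their preferred canonical version with a `rel="canonical"` link element. A minimal sketch of extracting that hint with Python's standard-library HTML parser (the HTML and URL are hypothetical):

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Finds a <link rel="canonical" href="..."> element in an HTML page."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "canonical":
            self.canonical = attributes.get("href")

html = """<html><head>
<link rel="canonical" href="https://example.com/article">
</head><body>Some content.</body></html>"""

finder = CanonicalFinder()
finder.feed(html)
print(finder.canonical)  # https://example.com/article
```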
Google also collects signals about the canonical page and its content that can be used when serving the page in search results, such as the page's language, the country the content targets, and how usable the page is.
If a page does not make it through indexing, this may be because its content is low quality or because a robots meta tag prohibits indexing. Keep in mind that not every page Google processes will be indexed.
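The usual way to prohibit indexing is a robots meta tag in the page's head:

```html
<head>
  <!-- Tells crawlers not to include this page in the index -->
  <meta name="robots" content="noindex">
</head>
```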
Google does not accept payment to rank pages higher; ranking happens programmatically. When a user enters a query, the system searches the index and returns the pages that Google considers most relevant to that query. Relevance is determined by many factors, such as the user's location, language, and device (for example, mobile or desktop).
For example, a search for "bicycle repair" will show different results to a user in Paris than to a user in Hong Kong, depending on which content is most relevant in that context.
The search features shown on the results page also vary with the query. For example, the query "bicycle repair" is likely to show local results, while "modern bicycle" may show image results but no local ones.
Sometimes Google Search Console may report that a page is indexed even though it does not appear in search results. This can happen for several reasons.
Note: we are constantly working to improve our algorithms. To keep up with changes, follow the Google Search Central blog.