Google uses crawlers and fetchers to perform actions for its products, either automatically or when triggered by a user request. "Crawler" (sometimes also called a "robot" or "spider") is a generic term for a program that automatically discovers and scans websites. Fetchers act as a program like wget and usually make a single request on behalf of a user. Google's crawlers and fetchers fall into three categories:
Common crawlers used for Google products (for example, Googlebot) always obey robots.txt rules for automatic crawls (a small robots.txt check is sketched after this list).
Special-case crawlers are similar to common crawlers, but they are used by specific products where there is an agreement between the crawled site and the Google product about the crawl process. For example, AdsBot ignores the global robots.txt user agent with the site owner's permission.
User-triggered fetchers are part of products and product features where the end user triggers the fetch. For example, Google Site Verifier acts on a user's request.
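As an aside (not part of the original documentation), the robots.txt rules that common crawlers obey can be inspected programmatically. The following is a minimal sketch using Python's standard urllib.robotparser module; the domain and paths are made-up placeholders, not anything referenced above.

```python
from urllib import robotparser

# Hypothetical robots.txt location; replace with your own site.
parser = robotparser.RobotFileParser("https://www.example.com/robots.txt")
parser.read()  # fetches and parses the robots.txt file

# Common crawlers such as Googlebot obey these rules for automatic crawls.
for path in ("/", "/private/report.html"):
    allowed = parser.can_fetch("Googlebot", f"https://www.example.com{path}")
    print(f"Googlebot may fetch {path}: {allowed}")
```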
Google's crawlers and fetchers are designed to run simultaneously on thousands of machines to improve performance and scale as the web grows. To optimize bandwidth usage, these clients are distributed across many data centers around the world so that they are located closer to the sites they may access. Therefore, your logs may show visits from several IP addresses. Google egresses primarily from IP addresses in the United States. If Google detects that a site is blocking requests from the United States, it may attempt to crawl from IP addresses located in other countries.
Google's crawlers and fetchers support HTTP/1.1 and HTTP/2. The crawlers use the protocol version that provides the best crawling performance and may switch protocols depending on statistics from previous crawl sessions. By default, Google's crawlers use HTTP/1.1. Crawling over HTTP/2 may save computing resources (for example, CPU and RAM) for both your site and Googlebot, but otherwise it brings no specific advantage to the site (for example, it does not affect ranking in Google Search). To opt out of crawling over HTTP/2, configure your server to respond with status 421 when Google attempts to access your site over HTTP/2. If that is not feasible, you can send a message to the crawling team (however, this is a temporary solution).
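If your server stack exposes the negotiated protocol version, the 421 opt-out described above can be implemented at the application layer. The sketch below is my own illustration, not Google's code: a minimal ASGI application that answers HTTP/2 requests with status 421, assuming it runs behind an HTTP/2-capable ASGI server such as Hypercorn.

```python
# Minimal ASGI app: reply 421 (Misdirected Request) to requests arriving
# over HTTP/2, which is the documented way to opt out of HTTP/2 crawling.
# Assumes an HTTP/2-capable ASGI server (for example, Hypercorn with TLS).

async def app(scope, receive, send):
    if scope["type"] != "http":
        return

    if scope.get("http_version") == "2":
        # Signal that this site should not be accessed over HTTP/2.
        await send({"type": "http.response.start", "status": 421, "headers": []})
        await send({"type": "http.response.body", "body": b""})
        return

    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello over HTTP/1.1"})
```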
Google's crawler infrastructure also supports crawling over FTP (as defined in RFC 959 and its updates) and FTPS (as defined in RFC 4217 and its updates); however, crawling over these protocols is rare.
Google's crawlers and fetchers support the following content encodings (compression methods): gzip, deflate, and Brotli (br). The content encodings supported by each Google user agent are advertised in the Accept-Encoding header of each request they make. For example: Accept-Encoding: gzip, deflate, br.
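As a rough illustration of how a server might honor the Accept-Encoding header shown above (my own sketch, not from the documentation), the snippet below compresses a response with gzip only when the client advertises support for it; deflate and Brotli handling would be analogous.

```python
import gzip

def encode_body(body: bytes, accept_encoding: str) -> tuple[bytes, dict]:
    """Pick a content encoding based on the client's Accept-Encoding header.

    Only gzip is handled here; Brotli (br) would need a third-party library.
    """
    offered = {token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")}
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}

# Example: a request carrying the header advertised by Google's crawlers.
compressed, headers = encode_body(b"<html>...</html>", "gzip, deflate, br")
print(headers, len(compressed))
```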
Our goal is to crawl as many pages of your site as possible on each visit without overwhelming your server. If your site has trouble keeping up with requests from Google, you can reduce the crawl rate. Note that sending the wrong HTTP status code to Google's crawlers can affect how your site appears in Google products.
Google's crawling infrastructure supports heuristic HTTP caching as defined by the HTTP caching standard, specifically through the ETag response header and If-None-Match request header, as well as the Last-Modified response header and If-Modified-Since request header.
Note: We recommend setting both the ETag and Last-Modified values regardless of the preference of Google's crawlers. These headers are also used by other applications, such as CMSes.
If both the ETag and Last-Modified fields are present in the response headers, Google's crawlers use the ETag value, as required by the HTTP standard. For Google's crawlers specifically, we recommend indicating caching preferences with ETag rather than Last-Modified, because ETag has no date-formatting issues.
Other HTTP caching directives are not supported.
Individual Google crawlers and fetchers may or may not make use of caching, depending on the needs of the product they are associated with. For example, Googlebot supports caching when re-crawling URLs for Google Search, while Storebot-Google supports caching only under certain conditions.
To implement HTTP caching for your site, contact your hosting provider or content management system vendor.
Google's crawling infrastructure supports ETag and If-None-Match, as defined by the HTTP caching standard. Learn more about the ETag response header and its If-None-Match request counterpart.
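To make the ETag / If-None-Match exchange concrete, here is a minimal sketch of my own (not from the documentation) showing a handler that returns 304 when the validator the crawler sends back still matches; deriving the ETag from a content hash is just one possible choice.

```python
import hashlib

def conditional_response(body: bytes, if_none_match: str | None):
    """Return (status, headers, body) honoring ETag / If-None-Match.

    The ETag here is a hash of the content; real sites may derive it from a
    version number, timestamp, or anything that changes with the body.
    """
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if if_none_match is not None and etag in {t.strip() for t in if_none_match.split(",")}:
        # The crawler's cached copy is still valid: no body is re-sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body

# First request: no validator, so the full body and an ETag are returned.
status, headers, _ = conditional_response(b"<html>v1</html>", None)
# Re-crawl: the crawler echoes the ETag in If-None-Match and gets 304.
status_2, _, _ = conditional_response(b"<html>v1</html>", headers["ETag"])
print(status, status_2)  # 200 304
```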
Google's crawling infrastructure supports Last-Modified and If-Modified-Since, as defined by the HTTP caching standard, with some caveats.
Learn more about the Last-Modified response header and its If-Modified-Since request counterpart.
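For illustration only (my own sketch, with a made-up modification timestamp), the snippet below shows the date-based counterpart: emitting Last-Modified in the HTTP date format and deciding whether an If-Modified-Since value still covers the resource.

```python
from email.utils import formatdate, parsedate_to_datetime

def not_modified_since(last_modified_ts: float, if_modified_since: str | None) -> bool:
    """True if the resource hasn't changed since the date the crawler sent."""
    if if_modified_since is None:
        return False
    try:
        cached = parsedate_to_datetime(if_modified_since).timestamp()
    except (TypeError, ValueError):
        return False
    return last_modified_ts <= cached

resource_mtime = 1_751_623_200.0  # hypothetical modification time (Unix timestamp)
header_value = formatdate(resource_mtime, usegmt=True)  # HTTP date format
print("Last-Modified:", header_value)
print("Would answer 304:", not_modified_since(resource_mtime, header_value))  # True
```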
Google's crawlers identify themselves in three ways: by their HTTP user-agent request header, by the source IP address of the request, and by the reverse DNS hostname of the source IP address.
Learn how to use this information to verify Google's crawlers and fetchers.
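One widely documented way to perform that verification is a reverse DNS lookup on the requesting IP followed by a forward confirmation. The sketch below is my own, uses a placeholder documentation IP, and assumes the googlebot.com / google.com hostname suffixes; check Google's published guidance for the current list of domains.

```python
import socket

def is_google_crawler(ip: str) -> bool:
    """Verify a claimed Google crawler with a reverse + forward DNS check."""
    try:
        host, _, _ = socket.gethostbyaddr(ip)  # reverse lookup
    except socket.herror:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(host)[2]  # forward confirmation
    except socket.gaierror:
        return False
    return ip in forward_ips

# Example with a placeholder address; substitute an IP from your access logs.
print(is_google_crawler("192.0.2.1"))
```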