Special-case crawlers are used by specific services, and the crawl conditions are governed by a separate agreement between the site and the service. For example, such a crawler may ignore the general rule in robots.txt for all agents (*) if the site owner has consented.
These crawlers operate from different IP ranges than the standard search crawlers. The list of these IP addresses is published in a dedicated JSON file. The reverse DNS record of these IPs matches the pattern rate-limited-proxy-***-***-***-***.google.com.
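This naming scheme makes it possible to check whether a request really came from one of these crawlers: resolve the IP to a hostname, check the domain suffix, then resolve the hostname back and confirm it matches the original IP. A minimal Python sketch of such a check (the function names and the suffix list are illustrative, not an official API):

```python
import socket

# Suffixes used by Google's crawlers; assumed here for illustration.
GOOGLE_SUFFIXES = (".google.com", ".googlebot.com")

def hostname_is_google(hostname: str) -> bool:
    """Suffix check on a reverse-DNS name."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)

def verify_crawler_ip(ip: str) -> bool:
    """Reverse-resolve the IP, check the suffix, then forward-resolve
    the hostname and confirm the original IP is among the results."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)          # reverse DNS
    except OSError:
        return False
    if not hostname_is_google(hostname):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(hostname)[2]  # forward confirm
    except OSError:
        return False
    return ip in forward_ips

# Usage (requires network access):
# verify_crawler_ip("66.249.90.77")
```

The forward-confirmation step matters: a reverse DNS record alone can be spoofed by whoever controls the reverse zone for the IP, while the forward lookup is answered by the claimed domain's own DNS.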
Below is a list of special-case crawlers used by different services: their user-agent strings in HTTP requests, their tokens for robots.txt, and a description of how their settings affect crawling behavior. The list is not exhaustive, but it covers the agents most commonly seen in website logs.
User-agent in HTTP requests: APIs-Google
User-agent token in robots.txt: APIs-Google
General rules specified for all agents (*) are not taken into account by this robot.
An example of robots.txt settings:
User-agent: APIs-Google
Allow: /archive/1Q84
Disallow: /archive/
The settings for this user-agent affect the delivery of push notifications through the API.
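This override behavior can be checked with Python's standard urllib.robotparser module: the group addressed to a specific token applies instead of the general * group. A minimal sketch (example.com and the paths are illustrative):

```python
import urllib.robotparser

rules = """\
User-agent: *
Disallow: /

User-agent: APIs-Google
Allow: /archive/1Q84
Disallow: /archive/
"""

rp = urllib.robotparser.RobotFileParser()
rp.modified()  # mark rules as loaded; parse() alone leaves can_fetch() returning False
rp.parse(rules.splitlines())

# The specific group wins over the blanket "Disallow: /" for *:
print(rp.can_fetch("APIs-Google", "https://example.com/archive/1Q84"))   # True
print(rp.can_fetch("APIs-Google", "https://example.com/archive/42"))     # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/archive/1Q84"))  # False
```

Note that urllib.robotparser applies the first matching rule in a group rather than the longest-match precedence of RFC 9309, so rule order within a group matters in this sketch.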
User-agent in HTTP requests: AdsBot-Google-Mobile
User-agent token in robots.txt: AdsBot-Google-Mobile
General directives are ignored.
User-agent: AdsBot-Google-Mobile
Allow: /archive/1Q84
Disallow: /archive/
This agent checks the quality of ads on your site's mobile pages that are used in advertising products.
User-agent in HTTP requests: AdsBot-Google
User-agent token in robots.txt: AdsBot-Google
General rules for the * user-agent are ignored.
User-agent: AdsBot-Google
Allow: /archive/1Q84
Disallow: /archive/
Used to assess the quality of advertising content on the pages of the site.
User-agent in HTTP requests: Mediapartners-Google
User-agent token in robots.txt: Mediapartners-Google
Ignores general directives in robots.txt.
User-agent: Mediapartners-Google
Allow: /archive/1Q84
Disallow: /archive/
Crawls the site in order to show relevant ads on it.
User-agent in HTTP requests: Google-Safety
User-agent token in robots.txt: not applicable; this agent ignores the rules.
It is used to detect malicious links and other suspicious activity on the site's pages. This agent does not obey robots.txt settings, because it serves to protect users.
The following agents were used previously but are no longer active. The information is provided for reference.
User-agent: AdsBot-Google-Mobile
User-agent token: AdsBot-Google-Mobile
It was used to assess the quality of ads on mobile devices such as smartphones.
User-agent: DuplexWeb-Google
User-agent token: DuplexWeb-Google
This agent could ignore the general rules; it was used by interactive services (Duplex on the web).
User-agent: Google Favicon
User-agent token: Googlebot-Image, Googlebot
It was responsible for fetching site favicons and displaying them in various interfaces.
User-agent: AdsBot-Google-Mobile-Apps
User-agent token: AdsBot-Google-Mobile-Apps
It crawled Android app pages to assess their quality and compliance with advertising requirements.
User-agent: Googleweblight
User-agent token: Googleweblight
It operated only when a real user opened a page through a special search mode. It ignored robots.txt, since it was not considered an automatic crawler.
This agent checked pages for the no-transform directive of the Cache-Control header, which let a site opt out of the optimized rendering used on slow connections.
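The opt-out signal this agent looked for was the no-transform directive in the Cache-Control response header. A minimal sketch of such a check (the function name is illustrative):

```python
def opts_out_of_transcoding(headers: dict) -> bool:
    """Return True if the response headers carry Cache-Control: no-transform,
    which told transcoding services like this one to leave the page as-is."""
    cache_control = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",")]
    return "no-transform" in directives

print(opts_out_of_transcoding({"Cache-Control": "public, no-transform"}))  # True
print(opts_out_of_transcoding({"Cache-Control": "max-age=3600"}))          # False
```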
If you need help with robots.txt configuration, special-case crawlers, or any other SEO questions, contact SEO.computer by email: info@seo.computer or WhatsApp: +7 920 204-44-61