Some crawlers are used not by the standard indexing pipeline but by individual products whose owners may have a special agreement with your site about crawl terms. For example, such a crawler may bypass the global robots.txt directives declared for all user agents (*) if it has its own explicit permission.
Such special-case crawlers operate from different IP ranges than the regular search crawlers. The current ranges are published in a dedicated JSON file, and their reverse DNS records follow the pattern rate-limited-proxy-***-***-***-***.google.com.
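The reverse DNS check implied above can be sketched in Python. This is a minimal illustration, not Google's verification procedure: the function name and regular expression are assumptions based solely on the hostname pattern described in this article.

```python
import re

# Pattern for reverse DNS (PTR) hostnames of special-case crawlers,
# per the pattern described above: rate-limited-proxy-<a>-<b>-<c>-<d>.google.com
# (illustrative sketch; real verification should also do a forward DNS lookup)
PTR_PATTERN = re.compile(
    r"^rate-limited-proxy-\d{1,3}-\d{1,3}-\d{1,3}-\d{1,3}\.google\.com\.?$"
)

def looks_like_special_crawler(ptr_hostname: str) -> bool:
    """Return True if a PTR hostname matches the special-crawler pattern."""
    return bool(PTR_PATTERN.match(ptr_hostname))

print(looks_like_special_crawler("rate-limited-proxy-66-249-90-77.google.com"))  # True
print(looks_like_special_crawler("crawl-66-249-66-1.googlebot.com"))             # False
```

In practice the PTR record would first be obtained with a reverse lookup (e.g. socket.gethostbyaddr), which requires network access and is omitted here.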
The agents are listed below with their user-agent strings, the tokens they match in robots.txt, and a description of how their settings affect the corresponding services. The list is not exhaustive, but it covers the crawlers that appear most often in server logs and raise the most questions among site owners.
User-agent: APIs-Google
robots.txt token: APIs-Google
Ignores the global * rules.
user-agent: APIs-Google
allow: /archive/1Q84
disallow: /archive/
Used by Google API services to deliver push notification messages to your site.
User-agent: AdsBot-Google-Mobile
robots.txt token: AdsBot-Google-Mobile
Also ignores the global * rules.
user-agent: AdsBot-Google-Mobile
allow: /archive/1Q84
disallow: /archive/
Used to check the quality of ads on the mobile versions of your site's pages.
User-agent: AdsBot-Google
robots.txt token: AdsBot-Google
Ignores the global * rules.
user-agent: AdsBot-Google
allow: /archive/1Q84
disallow: /archive/
This crawler checks the quality and compliance of ads on the site's pages.
User-agent: Mediapartners-Google
robots.txt token: Mediapartners-Google
Bypasses the global * directives.
user-agent: Mediapartners-Google
allow: /archive/1Q84
disallow: /archive/
Visits your site to select relevant ads and place them on its pages.
User-agent: Google-Safety
robots.txt: ignored entirely
This crawler detects malicious links and abuse on the site's pages. It does not obey robots.txt, since it operates solely for safety purposes.
Below are agents that are no longer in use but may previously have appeared in logs or influenced crawling behavior.
User-agent: AdsBot-Google-Mobile (deprecated)
robots.txt: ignored the global rules
Was used to assess the quality of ads on pages opened from mobile devices.
User-agent: DuplexWeb-Google
robots.txt: could ignore the global * directives
Was used by automated services to interact with the content of site pages.
User-agent: Google Favicon
robots.txt: used the standard Googlebot-Image and Googlebot tokens
Was responsible for fetching and selecting the site icon shown in search interfaces.
User-agent: AdsBot-Google-Mobile-Apps
robots.txt: followed the AdsBot-Google rules, ignoring *
Analyzed the pages of Android apps to evaluate their compliance with advertising requirements.
User-agent: Googleweblight
robots.txt: ignored the rules, since it was active only upon a user's request
This agent served simplified versions of site pages over slow connections and checked for the presence of the no-transform header.
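The no-transform check described above can be illustrated with a short sketch. This is not Google's actual implementation; the function name is hypothetical, and the sketch assumes the opt-out is expressed as the no-transform directive of the Cache-Control response header:

```python
def opts_out_of_transcoding(headers: dict) -> bool:
    """Return True if a response forbids transformation via the
    no-transform directive of Cache-Control (names compared case-insensitively)."""
    for name, value in headers.items():
        if name.lower() == "cache-control":
            directives = [d.strip().lower() for d in value.split(",")]
            if "no-transform" in directives:
                return True
    return False

print(opts_out_of_transcoding({"Cache-Control": "public, no-transform"}))  # True
print(opts_out_of_transcoding({"Cache-Control": "max-age=3600"}))          # False
```

A page returning the first set of headers would have been left untouched, while the second could have been served in the simplified form.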