This article describes in detail how different HTTP status codes, as well as network and DNS errors, affect a site's visibility in Google Search. It covers the most common server responses and the main types of errors the search bot can encounter while crawling your site; rarer codes and protocols are not covered. All of the problems described produce corresponding errors or warnings in the indexing reports for your site in Google Search Console.
Experimental features of the HTTP and FTP protocols are not supported unless stated otherwise.
An HTTP status code is the server's response to the client, whether a browser or a search robot, when a site page is requested. Each code has a specific meaning, but the way codes are handled is often similar. For example, several codes indicate a redirect, but in each case the result is that the final URL is fetched.
Google Search Console records errors for codes in the 4xx and 5xx ranges, as well as for failed redirects (3xx). If the server returns a 2xx code, the page content may be accepted for indexing, but indexing is not guaranteed.
Below are the main HTTP status codes most often encountered when a site is crawled, together with their effect on indexing in Google.
These codes mean the content was transferred successfully and Google can process it for indexing. However, if the page contains an error, for example empty content or an error message, Google may classify it as a soft 404.
200 (OK) - the page loaded successfully and its content is passed to the indexing system. Indexing is possible but not guaranteed.
201 (Created), 202 (Accepted) - Googlebot waits a limited time for the content to arrive, then passes whatever data it has received for indexing. The wait time depends on the agent type.
204 (No Content) - Googlebot tells the indexing system that there is no content. In the indexing report this may appear as a soft 404 error.
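The handling of these 2xx codes can be summarized in a small sketch. This is an illustrative model of the rules above, not any official Google API; the function name and the returned labels are invented for this example.

```python
def indexing_outcome(status: int, has_content: bool = True) -> str:
    """Rough indexing outcome of a 2xx response, per the rules above."""
    if status == 200:
        # Content is passed to indexing; indexing is possible, not guaranteed.
        # An empty or error page may still be treated as a soft 404.
        return "may-index" if has_content else "soft-404"
    if status in (201, 202):
        # Googlebot waits a limited time, then indexes whatever arrived.
        return "wait-then-index"
    if status == 204:
        # No content: may be reported as a soft 404.
        return "soft-404"
    return "not-2xx"
```

For example, `indexing_outcome(204)` returns `"soft-404"`, matching the warning that may appear in the indexing report.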
Googlebot follows a chain of at most 10 redirects. If the limit is exceeded and no content is received, a redirect error appears in the indexing report. The exact number of hops depends on the Googlebot type.
All content from the redirecting URL is ignored, and only the final URL is accepted for indexing. Special rules apply to robots.txt files served with 3xx codes.
301 (Moved Permanently) - Googlebot follows the redirect and treats the final URL as the main one, passing the page's weight to it.
302 (Found) - Googlebot follows the redirect, but the canonicalization signal is weaker.
303 (See Other), 304 (Not Modified) - the server reports that the content has not changed since the last visit, and Google does not re-index it.
307, 308 - treated like 302 and 301 respectively, but semantically different. Use the correct code on your site for better compatibility with other clients.
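The redirect-following behavior described above can be sketched with Python's standard library. This is a simplified model: the 10-hop limit mirrors the article, but the handler and function are example code, not Googlebot's implementation.

```python
import urllib.error
import urllib.parse
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so hops can be counted manually."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

def resolve_final_url(url, max_hops=10):
    """Follow a redirect chain by hand, giving up after max_hops redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    for _ in range(max_hops + 1):
        try:
            resp = opener.open(url)
        except urllib.error.HTTPError as err:
            resp = err  # with redirects disabled, 3xx surfaces as HTTPError
        if resp.getcode() in (301, 302, 303, 307, 308):
            location = resp.headers.get("Location")
            if location is None:
                return url  # malformed redirect: stop here
            # Content of the redirecting URL is ignored; move to the target.
            url = urllib.parse.urljoin(url, location)
            continue
        return url  # the final URL is the only one considered for indexing
    raise RuntimeError(f"redirect chain exceeded {max_hops} hops")
```

Calling `resolve_final_url` on a URL that 301-redirects returns the target address, which is the URL that would be considered for indexing.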
Pages with 4xx responses are not considered for indexing, and if they are already in the index, they are removed. The content of such pages is ignored entirely.
400 (Bad Request) and the other 4xx codes, except 429, mean there is no content, and the URL is excluded from the index. The crawl frequency of such pages is gradually reduced.
Do not use 401 and 403 to limit crawl frequency: these codes do not affect crawl rate. To restrict crawling, use dedicated mechanisms such as robots.txt rules.
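For instance, crawling of specific sections can be restricted with robots.txt rules instead of 401/403 responses; the paths below are only examples:

```
User-agent: Googlebot
Disallow: /internal/
Disallow: /search
```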
401 (Unauthorized), 403 (Forbidden), 404 (Not Found), 410 (Gone), 411 (Length Required)
429 (Too Many Requests) - Google treats this code as a sign of server overload and groups it with server errors.
5xx errors and 429 cause a temporary reduction in how fast the site is crawled. URLs that are already indexed are preserved, but under persistent errors they are removed from the index.
Pages returning 5xx are not considered for indexing. Separate rules apply to robots.txt files served with a 5xx code.
500 (Internal Server Error) - Google reduces the crawl frequency of the site depending on how many such errors occur.
502 (Bad Gateway), 503 (Service Unavailable) - similar crawl-rate restrictions apply.
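A client that wants to behave like the crawler described here would slow down after 429 and 5xx responses. The sketch below is an illustrative backoff policy, not Google's actual algorithm; the function name and the constants are invented for this example.

```python
import email.utils
import time
from typing import Optional

def backoff_seconds(status: int, retry_after: Optional[str] = None,
                    consecutive_errors: int = 0) -> float:
    """Seconds a polite client might wait before retrying the request."""
    if status not in (429, 500, 502, 503):
        return 0.0  # no overload signal: no slowdown needed
    if retry_after is not None:
        if retry_after.isdigit():
            return float(retry_after)  # Retry-After as delay-seconds
        # Retry-After may also be an HTTP date.
        parsed = email.utils.parsedate_to_datetime(retry_after)
        return max(0.0, parsed.timestamp() - time.time())
    # Exponential backoff, capped at five minutes, growing with errors.
    return min(300.0, 5.0 * (2 ** consecutive_errors))
```

The key idea mirrors the article: 429 and 5xx are overload signals, so the more of them occur in a row, the longer the pause before the next request.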
A soft 404 error occurs when a page returns status 200 but contains an error message or empty content. This can be caused by technical problems, for example missing include files or an empty search-results page.
Such pages create a poor user experience and are excluded from Google's index. The Search Console report displays a soft 404 warning.
Possible fixes depend on the situation and the desired result:
Return HTTP status 404 or 410 for a page that has no replacement, so search engines know it should be removed from the index. Set up a custom 404 page with helpful tips and navigation for visitors.
The custom 404 page must itself return status code 404 so that such pages are not indexed.
Set up a permanent 301 redirect to a new page with similar content so that users and search engines are taken to the relevant information. Verify the response with the URL Inspection tool.
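The first fix, serving a helpful error page that still returns a real 404 status, can be sketched with Python's standard http.server. The handler, paths, and page content are examples only.

```python
import http.server

CUSTOM_404 = b"<h1>Page not found</h1><p>Try the <a href='/'>home page</a>.</p>"

class Handler(http.server.BaseHTTPRequestHandler):
    KNOWN_PATHS = {"/"}  # in a real site this would be your routing table

    def do_GET(self):
        if self.path in self.KNOWN_PATHS:
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(b"<h1>Home</h1>")
        else:
            # Crucially: status 404, not 200, even for a friendly error page.
            # A helpful page served with 200 is what Google flags as soft 404.
            self.send_response(404)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(CUSTOM_404)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet
```

The visitor still sees useful navigation, but the 404 status tells search engines not to index the URL.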
Googlebot may have failed to load the page correctly because of missing resources, errors in the code, or blocked requests. Use the URL Inspection tool to view the rendered page and the HTTP code. Problems loading resources such as scripts and images can also lead to a soft 404 error.
The main causes are blocks in robots.txt, too many resources on the page, server errors, slow loading, or files that are too large.
Network and DNS errors quickly have a negative effect on a site's position in search. When Googlebot detects timeouts, connection resets, or DNS problems, it reduces the crawl frequency, concluding that the server cannot cope with the load.
Since no content is received when such errors occur, Google cannot index the pages, and previously indexed pages that have become unavailable are removed from search within a few days. The corresponding errors appear in Search Console reports.
If you do not manage the server yourself, we recommend contacting your hosting or CDN provider.
Network errors can occur before the server processes the request or during the crawl itself. The absence of an HTTP code complicates diagnosis. To eliminate timeout and connection-reset errors, check the network path between the crawler and your server.
Problems can be caused by overloaded network interfaces or improperly closed ports, which leads to packet loss and dropped connections.
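Timeouts and connection resets can be distinguished from HTTP-level errors with a low-level connection test. A minimal sketch, with host, port, and labels chosen for this example:

```python
import socket

def check_tcp(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Try a raw TCP connection and classify what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "timeout"           # server or network too slow to respond
    except ConnectionResetError:
        return "connection-reset"  # server accepted, then dropped the link
    except OSError as err:
        return f"error: {err}"     # refused, unreachable, etc.
```

If the TCP connection itself fails, the problem is below HTTP, and no status code will ever be returned, which matches the diagnostic difficulty described above.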
Most often, DNS errors are caused by misconfiguration or by requests being blocked at the firewall level. Diagnosis starts with verifying that the domain name resolves correctly.
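Resolution can be checked against the machine's own resolver with socket.getaddrinfo; the helper below is an example, not a complete diagnostic:

```python
import socket

def resolve(host: str) -> list:
    """Return the IP addresses the system resolver finds for host."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror as err:
        # gaierror is the resolver-level failure (NXDOMAIN, no reachable DNS).
        raise RuntimeError(f"DNS lookup failed for {host}: {err}")
    # Deduplicate addresses while preserving order.
    return list(dict.fromkeys(info[4][0] for info in infos))
```

If this fails on the server but succeeds elsewhere, the cause is likely local resolver configuration or a firewall blocking DNS traffic.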
For any questions about improving the indexing of your site in Google, we recommend contacting the SEO company CEO by email at info@seo.computer or via WhatsApp at +79202044461.