How HTTP status codes, network errors, and DNS errors affect site indexing in Google

This article describes in detail how different HTTP status codes, as well as network and DNS errors, affect a site's visibility in Google Search. It covers the most common server responses and the main types of errors a search bot can encounter while crawling your site; rarer codes and protocols are not covered. All of the problems described produce corresponding errors or warnings in the indexing reports in Google Search Console.

Experimental features of the HTTP and FTP protocols are not supported unless stated otherwise.

HTTP status codes and their impact on Google indexing

An HTTP status code is the server's answer to the client that contacts it, whether a browser or a search robot requesting a page of the site. Each code has a specific meaning, but the handling is often similar. For example, several codes indicate a redirect, but the result in each case is arrival at the final URL.

Google Search Console records errors for codes in the 4xx and 5xx ranges, as well as for failed redirects (3xx). If the server returns a 2xx code, the content of the page may be accepted for indexing, but there is no guarantee.
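As a rough illustration of this mapping (a sketch of the article's summary, not Google's actual implementation), the behaviour per status class can be expressed as a small classifier:

```python
def indexing_outcome(status: int) -> str:
    """Rough mapping of HTTP status classes to the indexing behaviour
    described in this article (illustrative, not Google's implementation)."""
    if 200 <= status < 300:
        return "content may be indexed (not guaranteed)"
    if 300 <= status < 400:
        return "redirect followed; only the final URL is indexed"
    if status == 429:
        return "treated as server overload; crawl rate reduced"
    if 400 <= status < 500:
        return "content ignored; URL removed from the index"
    if 500 <= status < 600:
        return "crawl rate reduced; persistent errors remove the URL"
    return "status class not covered in this article"

for code in (200, 301, 404, 429, 503):
    print(code, "->", indexing_outcome(code))
```

Note that 429 is checked before the generic 4xx branch, mirroring the article's point that Google groups it with server errors rather than client errors.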

Below is an overview of the main HTTP status codes most often encountered when crawling a site, and their influence on indexing in Google.

2xx (successful server responses and their processing by Google)

These codes mean the content was delivered successfully and Google can process it for indexing. However, if the page itself contains an error, for example empty content or an error message, Google may classify it as a soft 404.

200 (OK) - the page loaded successfully and its content is passed to the indexing pipeline. Indexing is possible, but not guaranteed.

201 (Created), 202 (Accepted) - Googlebot waits a limited time for the content to arrive, then submits whatever it received for indexing. The waiting time depends on the user agent type.

204 (No Content) - Googlebot signals the indexing pipeline that there is no content. In the indexing report this may appear as a soft 404 error.

3xx (redirects and their processing by Google)

Googlebot follows a chain of at most 10 redirects. If the limit is exceeded and no content is received, a redirect error appears in the indexing report. The exact number of hops depends on the Googlebot type.

All content served from redirecting URLs is ignored; only the final URL is considered for indexing. Special rules apply to robots.txt files that return 3xx codes.
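The 10-hop limit can be sketched as a small simulation (a toy model, with the chain given as a plain `url -> target` map rather than real HTTP requests):

```python
MAX_HOPS = 10  # the article states Googlebot follows at most 10 redirects

def resolve_redirects(url, redirects):
    """Follow a redirect chain given as a url -> target map. Returns the
    final URL, or None when the hop limit is exceeded -- the case that
    shows up as a redirect error in the Search Console report."""
    for _ in range(MAX_HOPS):
        if url not in redirects:
            return url  # this URL serves content; it is the one indexed
        url = redirects[url]
    return None  # chain longer than MAX_HOPS: content never received

short = {"/old": "/new"}
long_chain = {f"/step{i}": f"/step{i + 1}" for i in range(12)}
print(resolve_redirects("/old", short))         # /new
print(resolve_redirects("/step0", long_chain))  # None
```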

301 (Moved Permanently) - Googlebot follows the redirect and treats the final URL as canonical, passing the page's weight to it.

302 (Found, temporary redirect) - Googlebot follows the redirect, but the canonicalization signal is weaker.

303 (See Other), 304 (Not Modified) - with 304, the server reports that the content has not changed since the last visit, and Google does not reindex the page.

307, 308 - handled like 302 and 301 respectively, but semantically different. Use the appropriate code on your site for better compatibility with other clients.

4xx (client errors and their consequences for indexing in Google)

Pages returning 4xx responses are not considered for indexing, and if they are already in the index, they are removed. The content of such pages is ignored entirely.

400 (Bad Request) and the other 4xx codes, except 429, mean there is no content, and the URL is removed from the index. The crawl frequency for such pages is gradually reduced.

Do not use 401 and 403 to limit crawl rate: these codes do not affect crawling speed. To restrict crawling, use the dedicated settings instead.

401 (Unauthorized), 403 (Forbidden), 404 (Not Found), 410 (Gone), 411 (Length Required)

429 (Too Many Requests) - Google treats this code as a sign of server overload and groups it with server errors.

5xx (server errors and their impact on indexing in Google)

5xx errors and 429 cause a temporary reduction in the crawl rate for the site. Already indexed URLs are preserved, but with persistent errors they are removed from the index.
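The slowdown can be pictured with a toy throttling policy (the numbers and rule are purely hypothetical; Google's real scheduler is far more involved): back off on 429/5xx, recover on success.

```python
def next_crawl_delay(current_delay, status):
    """Toy throttling rule: double the delay between requests on 429 or
    any 5xx, halve it (down to a floor) on success."""
    if status == 429 or 500 <= status < 600:
        return min(current_delay * 2, 3600.0)  # cap the backoff at 1 hour
    return max(current_delay / 2, 1.0)         # never go below 1 second

delay = 1.0
for status in (200, 503, 503, 200):
    delay = next_crawl_delay(delay, status)
    print(status, "->", delay)
```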

Pages returning 5xx are not considered for indexing. Separate rules apply to robots.txt files that return 5xx.

500 (Internal Server Error) - Google reduces the crawl rate for the site depending on how many such errors occur.

502 (Bad Gateway), 503 (Service Unavailable) - handled similarly, with crawling restricted.

Soft 404 errors on the site in Google

A soft 404 error occurs when a page returns status 200 but contains an error message or empty content. This can be caused by technical problems, for example missing include files or an empty search results page.

Such pages create a poor user experience and are excluded from Google indexing. The Search Console report displays a soft 404 warning.
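A crude version of such a check can be written as a heuristic (the phrase list here is invented for illustration; Google's actual soft-404 classifier is not public):

```python
# Hypothetical markers of an error page; a real list would be much larger.
ERROR_PHRASES = ("page not found", "no results", "nothing found")

def looks_like_soft_404(status, html):
    """Flag a 200 response whose body is empty or reads like an error page,
    in the spirit of the soft-404 behaviour described above."""
    if status != 200:
        return False  # real error codes are handled by their own rules
    text = html.strip().lower()
    if not text:
        return True   # empty content on a 200 page
    return any(phrase in text for phrase in ERROR_PHRASES)

print(looks_like_soft_404(200, "<h1>Page Not Found</h1>"))    # True
print(looks_like_soft_404(200, "<h1>Product catalogue</h1>")) # False
```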

How to fix soft 404 errors for your site in Google

Solution options depend on the situation and the desired result:

  • The page and its content have been removed.
  • The page or its content has moved to another URL.
  • The page and its content are available and should be indexed.

If the page and content have been removed

Return HTTP status 404 or 410 for a page that has no replacement, so that search engines know it should be removed from the index. Set up a custom 404 page with useful tips and navigation for visitors.

  • State clearly, in plain and friendly language, that the page was not found.
  • Keep your site's style and navigation.
  • Add links to popular sections or the home page.
  • Consider adding a feedback option for reporting broken links.

The custom 404 page must itself return status code 404 so that such pages are not indexed.
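As a minimal sketch of this rule, here is a standard-library WSGI application (the `PAGES` map and the markup are hypothetical) that serves a friendly error body while still sending the real 404 status:

```python
# Hypothetical site content; anything not listed here gets the 404 page.
PAGES = {"/": "<h1>Home</h1>", "/blog": "<h1>Blog</h1>"}

NOT_FOUND_HTML = ("<h1>Page not found</h1>"
                  "<p>Sorry, that page does not exist. Try the "
                  "<a href='/'>home page</a> or the <a href='/blog'>blog</a>.</p>")

def app(environ, start_response):
    """WSGI app: friendly body for humans, real 404 status for crawlers."""
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [PAGES[path].encode("utf-8")]
    start_response("404 Not Found", [("Content-Type", "text/html; charset=utf-8")])
    return [NOT_FOUND_HTML.encode("utf-8")]

def call(path):
    """Tiny helper: invoke the app directly and capture the status line."""
    captured = {}
    def start_response(status, headers):
        captured["status"] = status
    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], body

print(call("/missing")[0])  # 404 Not Found
```

The key point is the status line: a pretty "not found" page served with "200 OK" is exactly what produces a soft 404 warning.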

If the page or content has moved

Set up a permanent 301 redirect to a new page with similar content so that users and search engines are correctly taken to the relevant information. Verify the server's response with the URL Inspection tool.
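For example, on an nginx server a permanent redirect for a single moved page might look like this (the paths are hypothetical; adapt them to your site):

```nginx
# Hypothetical example: /old-page has moved permanently to /new-page.
location = /old-page {
    return 301 /new-page;
}
```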

If the page and content are still available

Googlebot may have failed to load the page correctly due to missing resources, errors in the code, or blocking. Use the URL Inspection tool to view the rendered page and the HTTP code. Problems loading resources such as scripts and images can also lead to a soft 404 error.

The main causes are blocking in robots.txt, too many resources on the page, server errors, slow loading, or files that are too large.

Network and DNS errors on the site and their impact on Google indexing

Network and DNS errors quickly and negatively affect a site's position in search. When Googlebot detects timeouts, connection resets, or DNS problems, it starts reducing the crawl rate, concluding that the server cannot cope with the load.

Since no content is obtained when such errors occur, Google cannot index the pages, and previously indexed pages that have become unavailable are removed from search within a few days. The corresponding errors appear in Search Console reports.

If you do not manage the server yourself, we recommend contacting your hosting or CDN provider.

How to debug network errors on the website to improve indexing in Google

Network errors can occur before the server processes the request or during crawling. The absence of an HTTP code makes diagnosis harder. To eliminate timeout and connection-reset errors:

  • Check the firewall settings and logs; rule out blocking of the search robot's IP addresses.
  • Analyze network traffic with specialized tools to detect failures in network components.
  • If you cannot identify the problem yourself, contact your hosting provider.

Problems can be caused by overloaded network interfaces or improperly closed ports, which leads to packet loss and dropped connections.

How to diagnose and fix DNS errors on the site for successful indexing in Google

Most often, DNS errors are caused by misconfiguration or by requests being blocked at the firewall level. To diagnose them, take the following steps:

  • Check the firewall rules and make sure the search robot's IP addresses are not blocked and that UDP and TCP DNS queries are allowed.
  • Check that the A and CNAME DNS records are up to date, and verify that the specified IPs and names are correct.
  • Make sure all DNS servers are specified correctly and are working properly.
  • If DNS changes were made recently, allow time for the updates to propagate, and flush the DNS cache if necessary.
  • If you run your own DNS server, make sure it is stable and not overloaded.
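As a quick first check, the system resolver can be queried from Python's standard library (shown with "localhost", which resolves without external DNS; substitute your own domain):

```python
import socket

def ipv4_addresses(hostname):
    """Return the IPv4 addresses the system resolver reports for a hostname.
    If this raises socket.gaierror, the name does not resolve at all."""
    infos = socket.getaddrinfo(hostname, None,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

print(ipv4_addresses("localhost"))
```

If the addresses returned here differ from what your DNS provider's control panel shows, the records have not propagated yet or the wrong server is answering.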

For any questions about improving the indexing of your site in Google, we recommend contacting the SEO company CEO by email at info@seo.computer or via WhatsApp at +79202044461.

Send a request and we will provide a consultation on SEO promotion of your website