How to find duplicates on a website using an SEO tool: step-by-step instructions

To get rid of duplicate pages, you first need to identify them correctly, which can be a labor-intensive process. A specialized duplicate-finding program helps here: it can detect pages that are 70-80% similar to each other. This article provides step-by-step instructions for finding duplicates.

  • How to find duplicate pages
  • How to exclude blocks from analysis

How to find similar pages on a website (duplicates)

Go to the “Configurations” section and select “Content”. Then open the duplicate search option, “Duplicates”.

In the window that opens, activate the “Enable Near Duplicates” option. Next, set the similarity threshold for the duplicate search; 80-90% is considered optimal. You can set a lower value to also catch pages that match less closely.
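To make the idea of a similarity threshold concrete, here is a minimal sketch of one common way to score two texts: Jaccard similarity over overlapping word groups (shingles). This is an illustration only and is not necessarily how the SEO tool computes its percentage; the function names `shingles`, `similarity` and `is_near_duplicate` and the default shingle size of 3 are assumptions made for the example.

```python
from __future__ import annotations


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split text into overlapping word groups (shingles) of `size` words."""
    words = text.lower().split()
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts, expressed as a percentage (0-100)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 100.0  # two empty texts are treated as identical
    return 100.0 * len(sa & sb) / len(sa | sb)


def is_near_duplicate(a: str, b: str, threshold: float = 90.0) -> bool:
    """Hypothetical helper: flag a pair whose score reaches the threshold."""
    return similarity(a, b) >= threshold
```

With a score like this, lowering the threshold flags more page pairs, and raising it keeps only the closest matches.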

After setup, start crawling the site. Then go to the “Content” section and select the “Near Duplicates” line in the upper corner. The list that opens shows pages sorted in descending order of similarity, starting with 100%.

Clicking a link sends it to the lower window, where you can check all of the page's parameters if necessary. Clicking again opens a detailed view, where you can see the matching elements and identify the reasons for duplication.
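If you want to do a similar check by hand on two pages' extracted text, the following sketch lists the lines that both texts share, using Python's standard `difflib` module. It is an illustration and not the tool's own comparison view; the helper name `shared_lines` and the sample texts are assumptions.

```python
import difflib


def shared_lines(a: str, b: str) -> list[str]:
    """Return the lines that appear, in the same order, in both texts."""
    lines_a, lines_b = a.splitlines(), b.splitlines()
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    shared = []
    for block in matcher.get_matching_blocks():
        # Each block marks `size` consecutive lines common to both texts;
        # the final block always has size 0 and adds nothing.
        shared.extend(lines_a[block.a:block.a + block.size])
    return shared


page_a = "Site header\nArticle about red shoes\nPopular products"
page_b = "Site header\nArticle about blue hats\nPopular products"
print(shared_lines(page_a, page_b))
```

Lines that show up as shared on many page pairs, such as a header or a product block, are usually the cause of the high similarity score.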

If the site is very large and the search takes a long time, you can raise the similarity threshold. To do this, select “Crawl Analytics” in the menu, then click “Configure” in the drop-down list.

Make sure the checkbox next to “Content” is checked so that content analysis is active. If it is not, check it and click “Start” to continue.

The results can be checked in the “Near Duplicates” section. This time, only pages whose similarity reaches the new, higher threshold will be displayed.
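The effect of raising the threshold can be sketched as a simple filter over similarity scores that have already been calculated. The data structure, the URLs and the function name `filter_near_duplicates` below are hypothetical and serve only to illustrate the idea.

```python
def filter_near_duplicates(scores, threshold):
    """Keep page pairs at or above the threshold, most similar first."""
    kept = [(pair, score) for pair, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical scores (in percent) for three page pairs.
scores = {
    ("/shoes", "/shoes?sort=price"): 100.0,
    ("/shoes", "/boots"): 84.0,
    ("/boots", "/boots-sale"): 92.5,
}

for pair, score in filter_near_duplicates(scores, threshold=90.0):
    print(pair, score)
```

The scores are not recomputed here: the same numbers are simply cut off at a different point, so a higher threshold yields a shorter list.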

How to exclude a block when searching for duplicates

Pages are often flagged as duplicates because they share identical elements, such as the site header or blocks with popular products. To compare only the unique parts of the content, go to the “Content” section and select “Area” from the drop-down list.

In the window that opens, you will see two modes, “Include” and “Exclude”, where you can configure the search by tags, classes or block IDs.

For example, to exclude a block with popular products, insert the class or tag of this block and select the “Exclude” option.
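The principle behind exclusion can be sketched in a few lines: strip the text of the excluded blocks first, then compare only what remains. The example below uses Python's standard `html.parser` module; the class `TextExtractor`, the helper `extract_text` and the sample markup are assumptions for illustration, not the tool's implementation. The sketch also assumes well-formed HTML in which every non-void tag is explicitly closed.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TextExtractor(HTMLParser):
    """Collect page text, skipping elements matched by tag, class or ID."""

    def __init__(self, exclude_classes=(), exclude_tags=(), exclude_ids=()):
        super().__init__()
        self.exclude_classes = set(exclude_classes)
        self.exclude_tags = set(exclude_tags)
        self.exclude_ids = set(exclude_ids)
        self.skip_depth = 0  # greater than 0 while inside an excluded element
        self.parts = []

    def _is_excluded(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        return (tag in self.exclude_tags
                or attrs.get("id") in self.exclude_ids
                or bool(classes & self.exclude_classes))

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth or self._is_excluded(tag, attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.parts.append(data.strip())


def extract_text(html, **exclusions):
    """Return the page text with the excluded blocks removed."""
    parser = TextExtractor(**exclusions)
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


html = ('<header>Site menu</header>'
        '<div class="popular products"><p>Top item<br>more</p></div>'
        '<div id="main"><p>Unique text</p></div>')
print(extract_text(html, exclude_classes=["popular"], exclude_tags=["header"]))
```

Once the shared header and the popular-products block are removed from every page, the remaining text is what actually distinguishes one page from another.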

Now go back to the “Crawl Analytics” section and check the analysis results. If all parameters are configured correctly, the excluded blocks will no longer be taken into account when checking for duplicates.

If you have any questions, do not hesitate to write to the SEO studio at info@seo.computer.
