March 07, 2021

The long question is: how will canonicalizing e-commerce pagination pages to the first page affect SEO and Google’s crawling of products listed on the other pages? I prefer to simplify complex things, so let me first explain what I mean by “canonicalizing pagination pages” on e-commerce or other websites.

Let’s assume that a product category on your website is split across several pages:

  • The second and later pages of the product category are usually indexable, because their canonical tag points to their own URL (with the “?page=” parameter or similar).
  • But we want only the first page of the product category to be indexed, so we point the canonical tag of the pagination pages to the first page.
  • In that case, when Googlebot visits those pages, it will remove them from the search results.
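
To make the rule concrete, here is a minimal sketch of how the canonical target of a pagination URL could be derived by dropping the pagination parameter. The parameter name `page`, the helper names and the example.com URL are assumptions for illustration; your platform may use a different parameter or path-based pagination, and this is not the code used on the website discussed in this post.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

PAGE_PARAM = "page"  # assumed name of the pagination parameter


def canonical_for(url: str) -> str:
    """Return the URL without the pagination parameter, i.e. the first page."""
    parts = urlsplit(url)
    # Keep every query parameter except the pagination one.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != PAGE_PARAM
    ]
    # The fragment is dropped because it is irrelevant for a canonical URL.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Build the <link> element that a pagination page would output."""
    return f'<link rel="canonical" href="{escape(canonical_for(url))}" />'


print(canonical_tag("https://www.example.com/category/laptops?page=3"))
```

Other query parameters (for example a sort order) are kept as they are in this sketch; whether they should also be stripped depends on how your category URLs are built.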

An example of a pagination series on an e-commerce website that has been canonicalized to the first page:

CarrefourSA – The structure of the canonical tag of the second page. (The page was automatically translated into English)

Okay, so now you have an idea of how it works.

If you would like to test it on the website above and are not located in Turkey, you need to switch to the “Googlebot” user agent under Network conditions in Chrome DevTools and refresh the page.
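
If you prefer a script to the browser, the same check can be sketched in Python: request the page with a Googlebot user-agent string and read the canonical tag from the HTML. The function and class names are my own for this example, and the sample HTML uses a placeholder example.com URL. Keep in mind that some websites verify Googlebot by IP address, so sending the user-agent string alone may not always return what the real Googlebot sees.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Publicly documented Googlebot user-agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def extract_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def fetch_as_googlebot(url: str, timeout: float = 10.0) -> str:
    """Download a page while sending the Googlebot user-agent header."""
    request = Request(url, headers={"User-Agent": GOOGLEBOT_UA})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Offline demonstration on a hand-written snippet (no network request is made).
sample_html = (
    '<html><head><link rel="canonical" '
    'href="https://www.example.com/category/laptops"></head></html>'
)
print(extract_canonical(sample_html))
```

For a live check, you would call `extract_canonical(fetch_as_googlebot(url))` with the URL of the second page of a category and compare the result with the URL of the first page.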

I have seen many articles, and even Google’s official documentation, where this method is not recommended at all. Some advise against pointing the canonical tag to the first page; others say Google will not find products if the pagination pages are de-indexed, and so on. I will not judge anybody for that; we all have different best practices for SEO and its methodology.

I’m here to tell you that it’s fine to canonicalize pagination pages to the first page. I’ve done the same thing on several large e-commerce websites, and guess what? None of them was impacted negatively; on the contrary, there was a positive impact.

You may have some doubts about this method. That’s why, in this blog post, I’m bringing some first-party data directly from the server side. In the final part, I will explain why I started doing this implementation.

Let’s make a fresh start.

The beginning of canonicalizing pagination series to the first page

Google Search Console, Performance report

It was December 2020. We started de-indexing pagination pages by pointing their canonical tag to the first page of each category. In the previous 12 months, pages with the “?page=” URL parameter had received over 4 million organic impressions and almost 40 thousand clicks, a click-through rate of roughly 1%. It was risky to make that change, but in SEO there is always a challenge. Keep in mind the average position of these pages and the gap between their clicks and impressions.

Google started the process of removing tens of thousands of pagination pages from search results.

Google Search Console, Index Coverage report – Alternate page with proper canonical tag

After the implementation, Google started to show these pages in different Index Coverage reports, one of which is shown above. Before the implementation, Google crawled over 30 thousand URLs that included the “?page=” parameter. Now this number is 485, and soon it will be 0. Keep in mind the number of indexable pagination pages.

There weren’t any backlinks to pagination pages.

Google Search Console, Links report – Top externally linked pages

You may say, “So what?” But it matters, because if there had been backlinks to pagination pages, the de-indexing process would probably have taken longer. Sometimes good backlinks can have a big impact on the indexing process.

As pagination pages are now non-indexable, we removed them from the XML sitemaps as well.

DeepCrawl report – Pages Not in Sitemaps

Some of the pagination pages were included in the XML sitemaps. Because we canonicalized them to the first page, they are non-indexable and must be removed from the sitemaps. Keep in mind that the sitemap is a significant signal that helps Google find your pages quickly.
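
As a rough sketch of that clean-up, the snippet below drops every `<url>` entry whose location carries the pagination parameter from a sitemap. The parameter name `page`, the function names and the example.com URLs are assumptions for illustration; in practice the sitemap is usually regenerated by the platform rather than edited after the fact.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
PAGE_PARAM = "page"  # assumed name of the pagination parameter


def is_paginated(url: str) -> bool:
    """True if the URL carries the pagination parameter."""
    return PAGE_PARAM in parse_qs(urlsplit(url).query, keep_blank_values=True)


def drop_paginated_urls(sitemap_xml: str) -> str:
    """Return the sitemap XML without <url> entries that point to pagination pages."""
    ET.register_namespace("", SITEMAP_NS)  # keep the default namespace on output
    root = ET.fromstring(sitemap_xml)
    # Iterate over a copy of the list because entries are removed while looping.
    for url_element in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        location = url_element.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if is_paginated(location):
            root.remove(url_element)
    return ET.tostring(root, encoding="unicode")


sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.example.com/category/laptops</loc></url>"
    "<url><loc>https://www.example.com/category/laptops?page=2</loc></url>"
    "</urlset>"
)
print(drop_paginated_urls(sample_sitemap))
```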

Can Google still find product pages that are listed in the pagination series?

The short answer is: yes! But let’s look a bit deeper into the situation.

  1. I exported all indexable product pages that have “-p-” in the URL path. In total, there are 16,310 product pages.
  2. I imported them as URL data into the Screaming Frog Log Analyser.
  3. I requested the access log file for the last week from the client.
  4. I analyzed over 380,000 events from the access log file.
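
The idea behind these steps can be sketched in a few lines of Python. This is not how the Screaming Frog Log Analyser works internally; it is a simplified illustration that assumes the access log is in the Combined Log Format and that uses made-up log lines and product paths. It also identifies Googlebot by user agent only, which can be spoofed, so a real audit should verify the requesting IP addresses as well.

```python
import re

# Combined Log Format, as commonly written by Apache and nginx (assumed here).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def crawled_product_paths(log_lines, product_paths):
    """Return the product paths that were requested with a Googlebot user agent."""
    wanted = set(product_paths)
    seen = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if "googlebot" not in match["agent"].lower():
            continue  # user-agent check only, no IP verification
        path = match["path"].split("?", 1)[0]  # ignore query strings
        if path in wanted:
            seen.add(path)
    return seen


# Hypothetical log lines and product paths, for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [01/Mar/2021:10:00:00 +0300] "GET /blue-widget-p-123 HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [01/Mar/2021:10:05:00 +0300] "GET /red-widget-p-456 HTTP/1.1" '
    '200 4096 "https://www.example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]
PRODUCTS = ["/blue-widget-p-123", "/red-widget-p-456"]

found = crawled_product_paths(SAMPLE_LOG, PRODUCTS)
print(f"{len(found)} of {len(PRODUCTS)} product URLs were requested by Googlebot")
```

Dividing the number of product URLs found by the total number of exported product URLs gives the share of products that Googlebot reached in the analyzed period.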

The results:

Screaming Frog Log Analyser – Results of the access log file analysis

Although this result covers only the last week, Googlebot found 13,272 of the 16,310 product URLs, about 81%. You may ask, “But where are the remaining 3,038 product pages?” That’s a great question. Googlebot has limited crawl resources, and it doesn’t crawl every single page in a day or a week. That’s why we always need to be patient with SEO processes.

What about the organic performance after the implementation?

Semrush, Organic Research – Product Pages

As this case is fairly new and I haven’t shared a case study yet, I can’t add the official organic performance data, but Semrush gives a pretty good estimate. Almost two years ago, I did the same thing with a competitor of Carrefour. You can check the results of that case study on DeepCrawl’s website.

How can Google find the product pages if they are included in non-indexable pagination pages?

Pretty simple:

  • Google has crawled these product pages many times, so it has already stored those URLs in its system*.
  • Product pages are already included in the XML sitemap, so Google can find them quickly.
  • Internal links work pretty well, although we still have work to do on them.
  • Lots of product pages have backlinks.

* I’m not fully sure how Google’s system works, but Google remembers even pages that were created 4-6 years ago.

Why do I recommend that method?

  • Googlebot can spend more of its crawl budget on other, more important pages.
  • It’s very unlikely that pagination pages will appear in the top 10 search results. It’s like a book: we start reading from the beginning to understand it better.

To be clear, I’m not telling you to go against Google’s guidelines. I’m saying that there are lots of things we need to test. Without testing, we can’t get results. Sometimes we lose, sometimes we win. That’s the reality.

Roman Adamita

Director of SEO

Roman Adamita was instrumental in the formation of the team. His work is paramount in driving the SEO department’s success forward. He is the winner of the Young Search Professional of the year at the MENA Search Awards 2019!
