Avoid Duplicate Content: An SEO Best Practice for Internet Retailers

Online retailers rely on organic search traffic to attract consumers who convert at a high rate. Search engines use advanced algorithms to score sites and determine which is most relevant for a given query. Because the organic search landscape is constantly evolving, even well-performing sites can lose traffic when search engines release an unexpected algorithm update.

Duplicate content is the term used to describe blocks of content that appear on multiple web pages. It can be created intentionally, as part of the site’s template or boilerplate information, or unintentionally, as a result of site architecture. While search engines handle duplicate content slightly differently, they all agree that it is not what they want to present on their results pages. Simply put, multiple search listings with identical information do not deliver the desired user experience for any organic search query.

Over the years, Google has taken the strongest position on the topic of duplicate content. Throughout the last decade, SEOs theorized about a “duplicate content penalty” and the level of uniqueness needed to avoid it. In early 2011, those theories became reality as Google released an algorithm update that specifically addressed duplicate content. The Panda update took the SEO world by storm as several well-known sites suffered ranking losses, a result of publishing duplicate or thin content.

Retailers face unique challenges with regard to duplicate content. The first results from the architecture of retail sites, which allows users to sort and display products in a number of ways, such as by size or color. The second is the sheer volume of products listed in each category. Finally, retailers are often limited to the manufacturer's description of a particular product. Since the same products are sold on many sites, this information ends up duplicated across many websites. Luckily, there are several SEO best practices designed to handle these types of duplicate content.

Page Canonicalization
Using a supported HTML link element, called a canonical tag (rel="canonical"), webmasters can indicate to search engines which page is the preferred version among pages with highly similar content. This tag is extremely valuable for sites that allow sorting of items through URL parameters or that use parameters for visitor tracking. A canonical tag should be placed on every page of a website that may be reached through multiple URLs. In a case study performed by one online retailer, correct implementation of canonical tagging resulted in a 10% increase in tracked keywords on Google page 1 after four weeks.
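As an illustration of the idea above, here is a minimal Python sketch of how a site might derive a canonical tag for a page reached through sorting or tracking parameters. The parameter names, the example.com URLs, and the function names are hypothetical and not taken from any particular platform; a real site would list its own non-canonical parameters.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameter names are illustrative only. Each site must
# decide which of its own parameters merely sort, filter, or track.
NON_CANONICAL_PARAMS = {
    "sort", "order", "color", "size",
    "utm_source", "utm_medium", "utm_campaign", "sessionid",
}


def canonical_url(url: str) -> str:
    """Return the URL without sorting/tracking parameters or a fragment."""
    parts = urlsplit(url)
    # Keep only the query parameters that change what the page is about.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NON_CANONICAL_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_link_tag(
    "https://www.example.com/shoes?sort=price&utm_source=newsletter"
))
```

Every sorted or tracked variant of the category page would then emit the same tag, pointing search engines at one preferred URL.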

Pagination Canonicalization
Product pagination is a common way to divide the products in a category across discrete pages. It is necessary to reduce page load time and to organize content for users. However, the unintended result is often several very similar pages optimized for an identical set of keywords. Organic search visitors can enter the site at any level of the pagination, which may create an unwanted user experience because less popular products are often listed deeper in the paginated results. Because the content is not identically duplicated, typical canonical tags are not appropriate here. Instead, Google has added support for pagination tags. Pagination tags create an HTML link between ordered pages that allows search engines to understand the site's pagination. They also work to consolidate acquired link value to the first page in the series and, in turn, signal that the first page is the preferred one to use for a search result listing.
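The linking between ordered pages can be sketched as follows, assuming the pagination tags referred to here are link elements with rel="prev" and rel="next" placed in each page's head. The "?page=N" URL scheme, the example.com address, and the function name are hypothetical choices made for the example.

```python
from html import escape


def pagination_link_tags(base_url: str, page: int, total_pages: int) -> list[str]:
    """Return the rel="prev"/rel="next" link tags for one page of a series."""
    if not 1 <= page <= total_pages:
        raise ValueError("page must be between 1 and total_pages")

    def page_url(number: int) -> str:
        # Assumption: page 1 lives at the bare category URL and later
        # pages at ?page=N. Real sites may use a different scheme.
        return base_url if number == 1 else f"{base_url}?page={number}"

    tags = []
    if page > 1:
        # Every page except the first points back to its predecessor.
        href = escape(page_url(page - 1), quote=True)
        tags.append(f'<link rel="prev" href="{href}">')
    if page < total_pages:
        # Every page except the last points forward to its successor.
        href = escape(page_url(page + 1), quote=True)
        tags.append(f'<link rel="next" href="{href}">')
    return tags


for tag in pagination_link_tags("https://www.example.com/shoes", 2, 3):
    print(tag)
```

The first page carries only a "next" tag and the last page only a "prev" tag, so the chain of tags describes the full ordered series.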

User Generated Content
Finally, one of the most difficult duplicate content situations for retailers to address is the product content itself. Not only are content snippets often duplicated throughout category and subcategory pages to help users select the right product, but the same content is often duplicated across all retailers that sell that product. One of the most effective ways to combat this is to add unique, relevant content to the page, which dilutes the duplicated content. Because writing unique content for each product can be time consuming, retailers often turn to user generated content to provide highly relevant content that is unique to each product. User generated content, often implemented on product pages in the form of user reviews or questions and answers, has also been shown to increase conversions. A 2009 case study by one online retailer found that review readers and writers converted at a rate 82% higher, and customers who interacted with online Q&A converted at a rate 58% higher.

Looking Ahead
With 24 iterations of the Panda algorithm released over the past two years, it is clear that Google continues to refine its automatic detection of duplicate content. By carefully applying these SEO best practices, retailers can protect themselves against unintentionally duplicating content and risking a resulting search engine penalty.

Guest contributor Christi Hart, Director, Client Services, Merkle
