The truth about duplicate content

There are many myths and rumours surrounding duplicate content. In this post we’ll separate the fact from the fiction in terms of its potential impact on your website.

What is ‘duplicate content’?

Where better to start than with Google’s definition?

“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin. Examples of non-malicious duplicate content could include:

• Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
• Store items shown or linked via multiple distinct URLs
• Printer-only versions of web pages”

There is a common misconception that any amount of duplication on a web page is a bad thing – this is not the case. Generally, some duplicate content is OK if the source is credited, the content adds value and it doesn’t make up a disproportionate amount of your page.

The quoted copy above is a good example of this: we are not passing it off as our own opinion, and we are very clearly referencing and linking to the original source.

So what’s wrong with duplicate content?

Problems with duplicate content arise if its intent is seen as malicious. Once upon a time, black hat SEO practitioners could copy content across sites to manipulate search engine rankings. But algorithms are much smarter now, so having a large amount of duplicate content on your site will do more harm than good.

Also, if your site has a significant amount of duplicate content, search engines will have the following problems:

– They won’t know which version(s) to show in search results – and what order to rank them in.
– They won’t know which version of the content to include/exclude from their results.
– With internal duplication in particular, search engines won’t know if they should direct the link metrics to one page, or keep them split between multiple versions – essentially diluting the ‘link juice’. But if the content is on only one URL, every link will point to that single page, enhancing its authority.

Can you be penalised by Google for using duplicate content?

There’s a common myth floating around that you can receive a formal penalty for duplicate content. However, in a recent video, Google’s Andrey Lipattsev was adamant that if Google discovers your site’s content isn’t unique and doesn’t rank your page above a competing page, it isn’t a penalty – it’s simply Google trying to give the end user the best experience. Depending on the search terms and the quality of your content, your page containing duplicate content could still appear higher for another relevant search.

In another video, Matt Cutts, who led Google’s webspam team, makes it clear that duplicate content won’t raise a red flag with the search engine giant unless it is spammy or involves keyword stuffing.

Although you may not be penalised by Google for duplicate content specifically, there are issues surrounding duplication which can hurt your rankings – namely the three points mentioned earlier.

Google and the other search engines love uniqueness, added value and high-quality content, so sites providing this will be rewarded, while sites made up largely of copied content won’t be.

What about plagiarism?

Reusing someone else’s content is generally tolerated when the person republishing it clearly credits the original source, although credit alone is not the same as permission from the copyright holder. If no acknowledgement of the source is included, this is classed as plagiarism; if you’re the victim, you could file a Digital Millennium Copyright Act (DMCA) complaint against the person who has stolen your content. Take a look at this real-life story of website plagiarism, including steps you might want to take if you’re in a similar situation.

How can you avoid duplicate content issues?

Although duplicate content may not be as deadly as many people believe, it’s still important to take steps to minimise its negative effects on your site. As a first step, tools such as Siteliner and Copyscape can help you to discover any obvious issues. You’ll find lots of helpful, up-to-date tips from the folks at Hobo Web, and if you have an ecommerce site, US agency Inflow have also produced a handy guide.
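
If you’d like a rough feel for what ‘appreciably similar’ can mean in practice, here is a minimal Python sketch that compares pages by the overlapping three-word phrases they share. This is only an illustration and not how Siteliner, Copyscape or Google actually work; the function names, the sample pages and the 0.8 threshold are all our own assumptions.

```python
import re
from itertools import combinations


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word groups ("shingles") in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0  # nothing to compare
    return len(sa & sb) / len(sa | sb)


def find_duplicates(pages: dict, threshold: float = 0.8) -> list:
    """Return (url_a, url_b, score) for every pair scoring at or above the threshold.

    The 0.8 default is an arbitrary assumption, not a figure from any search engine.
    """
    results = []
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            results.append((url_a, url_b, score))
    return results


if __name__ == "__main__":
    # Hypothetical pages, keyed by URL path.
    pages = {
        "/shoes": "red shoes in size nine for sale today",
        "/shoes?sort=price": "red shoes in size nine for sale today",
        "/garden": "completely different text about garden furniture sets",
    }
    for url_a, url_b, score in find_duplicates(pages):
        print(f"{url_a} and {url_b} look duplicated (score {score:.2f})")
```

A check like this only flags candidates for a human to review; whether a flagged pair is really a problem still depends on the points covered earlier in this post.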

If your website contains a lot of internal duplication, which is particularly common on ecommerce sites, you should indicate your preferred URLs to Google via canonicalisation, for example by adding a rel="canonical" link element to each duplicate page.
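
To make canonicalisation a little more concrete, here is a small Python sketch that reduces the many URL variants of a store page to one preferred URL and builds the matching canonical link element. Treat it as a starting point under stated assumptions: the list of ignored parameters, the choice to drop trailing slashes and the example.com URLs are ours, and your own site may need different rules.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters that change the URL but not the content of the page.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sort"}


def canonical_url(url: str) -> str:
    """Reduce a URL variant to a single preferred form."""
    parts = urlsplit(url)
    # Keep only meaningful parameters, sorted so their order no longer matters.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    # Assumption: /shoes/ and /shoes serve the same page on this site.
    path = parts.path.rstrip("/") or "/"
    # The fragment (#reviews and so on) is dropped; it never reaches the server.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the link element to place in the head of each duplicate page."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


if __name__ == "__main__":
    variant = "https://Example.com/shoes/?utm_source=news&colour=red&sort=price#reviews"
    print(canonical_url(variant))
    print(canonical_link_tag(variant))
```

Every variant of the page should output the same tag, so that search engines are pointed at a single preferred URL rather than having to choose between several.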

Image courtesy of Andrew Mager.
