Why Duplicate Content Is Bad for the Web

When you create duplicate web content, do your users know which source is most relevant and accurate?

Among the countless content problems found on websites, one of the most egregious and commonly cited is duplicate content.

Content strategists — not to mention search engine optimization (SEO) professionals — love to hate on duplicate content. As soon as we spot it during a content audit, we quickly bring out our metaphorical red pens. Begone, duplicate content!

But why?

For those on the other side of the table, it’s easy to dismiss duplicate web content as a minor problem, like a misplaced comma or a run-on sentence. In reality, duplicate web content can hurt findability, usability and user comprehension.

Yet, most instances of duplicate content exist to improve findability, not hinder it. So, how can publishing the same content in more places impact web users’ ability to find and use it?

Let’s dig into why duplicate content is bad for the web, as well as when it might be okay.

Reason #1: Search engines and irrelevant content

Web professionals usually discuss duplicate content within the context of search engine optimization (SEO). Duplicate content is a problem for SEO because it confuses search engines about what content is relevant and how pages should be prioritized in search results.

Since organic search is typically the biggest source of web traffic, we don’t want search engines discounting our content or marking it irrelevant. That’s bad news.

You can easily see the impact of duplicate content by searching for information on your own website. In a recent post on page titles and headers, we saw the effects of duplicate titles in search results. But duplicate-content issues extend beyond just page titles.

Contrary to popular belief, duplicate content on the web does not automatically equal spam. In other words, Google won’t penalize you in search results just because you have duplicate content. Matt Cutts, the head of Google’s Webspam team, recently discussed this topic in response to the question, "How does Google handle duplicate content?"

Cutts explains that Google expects duplicate content in its many forms; in fact, he assumes 25% to 30% of all web content is duplicated. (Boy, that’s depressing.)

But that explanation doesn’t tell the whole story. While duplicate content may not affect your search rankings, it does affect which content appears in search results and, in turn, the quality and relevance of what searchers see from you.

When you create duplicate content, you’re asking search engines to determine which piece of duplicate content is most appropriate and relevant. And that may not benefit you or your users.

Reason #2: Findability and confused users

Duplicate content confuses web users for the same reason it confuses search engines. Faced with duplicate content, users can’t tell which content is most important or where they should look for specific information.

For example, maybe you have a student advising center webpage. And in an attempt to increase findability, you create separate pages called “Student Advising” in every academic section, reiterating available student services information. This duplicate content may increase the chance that the content will be found, but once students discover this content in multiple locations, how will they know which webpage is the proper source?

Duplicate content causes web users to question what content is most relevant and useful.

When people search for information, they want reassurance they are in the right place. Duplicate content causes mistrust and hinders users’ ability to use information reliably.

Content is made relevant by the context in which users perceive it. When you duplicate content on your website, you alter the meaning and relevance of that content. Placing content in unexpected or unintuitive locations not only affects whether people will find it but can cause readers to misinterpret it.

Publishing content in the right spot — organizing and prioritizing content in an intuitive way — is tough, no doubt. That’s why information architecture is a distinct discipline. Information architecture works to organize and prioritize content to increase findability and enhance usability. This enables web users to easily find, discover and use content.

If you feel the need to create duplicate content, your website’s information architecture is likely ineffective: you’re compensating for an unintuitive navigation scheme. We should strive to make content findable without duplicating it.

Reason #3: Content governance and inaccurate information

As if content findability and usability weren’t enough, often the biggest problem with duplicate content is its impact on content governance and inconsistent, inaccurate information.

Let’s return to our student advising content example. If your student advising web content is duplicated in academic department sections, who is responsible for updating all those duplicate webpages? The Student Advising Center? Each academic department staff assistant? If you’re like many institutions, the answer is no one. And, as with many cases of duplicate content, not all content is updated consistently to reflect the same information.

While some inconsistencies may be trivial, others can have a significant impact. Consider the ramifications of outdated tuition information or application deadlines. I certainly wouldn’t want to be the one to answer a phone call from an angry student who discovered she owed more tuition than she budgeted and missed a recent financial aid application deadline.

Duplicate content can result in inconsistent and inaccurate information.

If by duplicating content we cause misinformation, any possible benefits of duplicating content quickly become undone. In fact, we’ve made things worse.

The challenge of managing duplicate content impacts web users as well as content owners. In most cases, content owners are challenged enough by keeping their unique content up-to-date, let alone managing duplicate content.

Is all duplicate content bad?

Webpages, yes. Content, no. While a duplicate webpage is always bad news, there are times when duplicate content may be appropriate, particularly with short descriptions and calls to action.

The questions of SEO, findability and usability come into play when you are duplicating large pieces of content that may confuse relevance. Duplicating an entire webpage certainly qualifies, but so does duplicating a video, photograph, paragraph or even a sentence, if the context is altered and confused.

The problem of duplicate content is not about length; it’s about meaning. If duplicate content hinders — rather than enhances — relevance and user comprehension, then it is bad.

As content professionals, our goal is not to create unique content but to create useful, usable, findable, on-brand content. If content meets these or other defined goals, then the question of duplicate content is irrelevant. This advice is not a free pass to publish duplicate content but is meant to help keep your eyes on the prize.

At the very least, identifying duplicate content should raise a red flag.
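If you want to raise that red flag systematically during a content audit, a short script can help. What follows is only a rough sketch, not a recommended tool: it assumes you have already exported each page's body text into a dictionary keyed by URL path (the sample pages, the function names and the 0.8 threshold are all made up for illustration). It compares pages by their overlapping three-word phrases and reports pairs that look nearly identical.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicates(pages, threshold=0.8):
    """Return (url_a, url_b, score) for page pairs at or above threshold."""
    sets = {url: shingles(body) for url, body in pages.items()}
    flagged = []
    for url_a, url_b in combinations(sorted(sets), 2):
        score = jaccard(sets[url_a], sets[url_b])
        if score >= threshold:
            flagged.append((url_a, url_b, round(score, 2)))
    return flagged


# Hypothetical audit data: URL path -> page body text.
pages = {
    "/advising": "Meet with an advisor to plan your courses each term.",
    "/biology/advising": "Meet with an advisor to plan your courses each term.",
    "/tuition": "Tuition is billed at the start of every semester.",
}

for url_a, url_b, score in find_duplicates(pages):
    print(url_a, url_b, score)
```

A script like this only tells you where the text overlaps. Whether a flagged pair actually hurts relevance or comprehension is still a judgment call for the person doing the audit.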

What types of duplicate content do you see on your website? What problems, if any, do you think duplicate content causes web content owners and users?

Photo by Craig Rodway / Flickr Creative Commons

About Rick Allen

Rick Allen has worked in higher education for over twelve years, helping to shape web communications and content strategy. As principal of ePublish Media, Inc., a web publishing and content strategy consultancy in Boston, Mass., Rick works with knowledge-centric organizations to create and sustain effective web content.

Comments

  1. The WORST type of duplicate content — and a problem that we continue to struggle with where I work — are the “fast facts” or “bragging points” tidbits of content that are typically lifted from a print piece then rarely if ever updated. The info is then inconsistent with other “fast facts” sites on the university web, creating the confusion you discuss in this post.

  2. I wanted to chime in because I’ve looked into the question of near-duplicate content for the case of Adaptive Content. The phrase has many definitions; here, I view it as content that is conditionally modified at run time, either to adapt it to smaller devices by selectively excising wordy bits, or to adapt it to user preferences or characteristics (say the referring URL indicates they linked from a management resource, so they get a custom insert).

    I’ve read Matt Cutts’s article closely from the viewpoint of how they handle dynamic content, and I’m satisfied that they are really striving to reward getting the right information to the right person via dynamic content modification. This may apply more to training or procedural material (think experience level or role) than to educational material. My point is just that use cases for apparently duplicate content can actually become more wide ranging as dynamically adapted delivery grows in popularity. I’m not close to suggesting any best practices yet; for now it will be good if in fact Google really IS making adjustments for the coming wave of adapted (and therefore seemingly overlapping) content.

  3. Hi,
    I have a question about content duplicated from another page on my own blog (the same site).
    Is there any problem with this kind of duplication…?
