How do I use public content without being penalized for duplication?
-
The NHTSA publishes a list of all automobile recalls. Its "terms of use" state that the information can be copied. I want to add that list to our site, so there is an up-to-date list for our audience to see. However, I'm just copying and pasting. I'm allowed to according to the NHTSA, but Google will probably flag it, right? Is there a way to do this without being penalized?
Thanks,
Ruben
-
I didn't think about other sites, but that's a fabulous point. Best to play it safe.
Thanks for your input!
- Ruben
-
My gut says that your idea to keep the content noindexed is best. Even if the content is unique, it borders on auto-generated content. Now, I might change my mind if a lot of users were actively interacting with most of these pages. If not, though, you'll end up with a large portion of your site consisting of auto-generated content that doesn't seem useful. Plus, it's also possible that other sites are using this information, so you would end up with content that is duplicated on other sites too.
I could be wrong, but my gut says not to try to use this content for ranking purposes.
-
I appreciate the follow-up, Marie. Please give me your thoughts on the following idea:
The NHTSA only posts the updates for the past month. If I noindex the page for now (which is what I'm doing) and wait five months, then what would happen? At that point, yes, the current month would be duplicated, but I'd have four months of "unique" content because the NHTSA deletes theirs. Plus, I could add pictures of all the automobiles, too. Do you think that would be enough to index it?
(I'm most likely going to keep it noindexed, because this borders on shady, or at least I could see Google taking it that way, but just as a thought experiment, what do you think?) Or anyone else?
Thanks,
Ruben
-
To expand on EGOL's answer: if you take someone else's content (even with their permission) and want Google to index it, then Google can see that you have a large amount of copied content on your site. This can trigger the Panda filter and can cause Google to consider your whole site low quality.
You can add a noindex tag as EGOL suggested or you could use a canonical tag to show Google who the originator of the content is, but probably the noindex tag is easiest.
There is one other option as well. If you think it is possible that you can add significant value to the content that is being provided then you can still keep it indexed. If you can combine the recall information with other valuable information then that might be ok to index. But, you have to truly be providing value, not just padding the page with words to make it look unique.
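For anyone who wants to see what the noindex and canonical options above look like in markup, here is a minimal sketch. It is only an illustration: the function name `head_tag_for_republished_page` and the `example.gov` URL are made up for this example, and how the tag gets into your page's `<head>` depends on your CMS or templates.

```python
from html import escape


def head_tag_for_republished_page(option: str, source_url: str = "") -> str:
    """Return the <head> tag for one of the two options discussed above.

    option="noindex"   -> ask search engines not to index the page, but follow its links
    option="canonical" -> point search engines at the original publisher's page
    """
    if option == "noindex":
        return '<meta name="robots" content="noindex, follow" />'
    if option == "canonical":
        if not source_url:
            raise ValueError("canonical needs the URL of the original page")
        # escape() keeps characters such as & or " from breaking the attribute value
        return f'<link rel="canonical" href="{escape(source_url, quote=True)}" />'
    raise ValueError(f"unknown option: {option!r}")


# Hypothetical source URL, for illustration only.
print(head_tag_for_republished_page("noindex"))
print(head_tag_for_republished_page("canonical", "https://example.gov/recalls/123"))
```

Use one or the other on a given page, not both. If the page is noindexed there is nothing left for a canonical tag to consolidate.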
-
Alright, that sounds good. Thanks!
-
I had a bunch of republished articles on my website, done mostly at the request of government agencies and universities. That site got hit in one of the early Panda updates. So, I deleted a lot of that content, and to the rest I added this line above the </head> tag:
<meta name="robots" content="noindex, follow" />
That tells Google not to index the page but to follow the links and allow PageRank to flow through. My site recovered a few weeks later.
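If you add the tag to a lot of pages, it can be worth checking that it actually made it into the rendered HTML. Below is a rough sketch using only Python's standard library. The class and function names (`RobotsMetaParser`, `robots_directives`) and the sample markup are made up for this example, and it only looks at the meta tag, not at an `X-Robots-Tag` HTTP header or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here by default.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html_text: str) -> set:
    """Return the set of robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# Sample markup for illustration; in practice, feed in the HTML of your own page.
sample = (
    "<html><head><title>Recalls</title>"
    '<meta name="robots" content="noindex, follow" />'
    "</head><body></body></html>"
)
found = robots_directives(sample)
print("noindex" in found, "follow" in found)
```

A page with no robots meta tag returns an empty set, which search engines treat the same as "index, follow".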