Duplicate content warnings
-
I have a ton of duplicate content warnings for my site poker-coaching.net, but I can't see which URLs are supposed to be duplicates. Is there a function where I can check the original URL against the list of other URLs that carry the duplicate content?
-
Thanks for the help. I am trying to cover all bases here. Duplicate content was one concern; the others are excessive link density and bad incoming links.
I have downloaded a full backlinks report now using Majestic SEO (OSE only shows incoming links from 74 domains...).
I think I may have found the problem. I used to have a forum on that domain years ago which was hacked and used for a lot of spammy outgoing links for stuff like Cialis, Viagra, etc. Those guys also linked from other sites to my forum pages. Example:
| from: http://www.grupoibira.com/foro/viewto... | Anchor: buy flagyl in lincolnu... | 3 | to: http://www.poker-coaching.net/phpbb3/... |
When I closed the forum and deleted the forum plugin, I redirected all forum pages to my main page, which under the circumstances was a bad mistake, I guess, because with the redirect all those spammy links now end up pointing to my main page, right? So first, I have now removed that redirect.
But the problem remains that I still have plenty of links from spam sites pointing to URLs of my domain that do not exist any more.
Is there anything else I can do to remove those links or have Google remove/disregard them, or do you think a reconsideration request explaining the situation would help?
-
Honestly, with only 235 indexed pages, it's pretty doubtful that duplicate content caused you an outright penalty (such as being hit with Panda). Given your industry, it's much more likely you've got a link-based penalty or link quality issue in play.
You do have a chunk of spammy blog comments and some low-value article marketing, for example:
http://undertheinfluence.nationaljournal.com/2010/02/summit-attendees.php
A bit of that is fine (and happens in your industry a lot), but when it's too much of your link profile too soon, you could be getting yourself into penalty territory.
-
Hey There,
Just to clarify, to see the source of those errors, you’ll need to download your Full Crawl Diagnostics CSV and open it up in something like Excel. In the first column, perform a search for the URL of the page you are looking for. When you find the correct row, look in the last column labeled referrer. This tells you the referral URL of the page where our crawlers first found the target URL. You can then visit this URL to find the source of your errors. If you need more help with that, check out this link: http://seomoz.zendesk.com/entries/20895376-crawl-diagnostics-csv
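If the CSV is too big to search comfortably by hand, the same lookup can be scripted. This is only a rough sketch: it assumes the URL sits in the first column and the referrer in the last column, as described above, and the sample rows and the `find_referrers` helper are made up for illustration, not taken from a real export.

```python
import csv
import io

def find_referrers(csv_file, target_url):
    """Return the referrer value of every row whose first column is target_url."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    # Assumption: first column = crawled URL, last column = referrer.
    return [row[-1] for row in reader if row and row[0] == target_url]

# Made-up sample standing in for the downloaded crawl diagnostics file.
sample = io.StringIO(
    "URL,Title,Referrer\n"
    "http://www.example.com/a,Page A,http://www.example.com/\n"
    "http://www.example.com/a?x=1,Page A,http://www.example.com/list\n"
)
referrers = find_referrers(sample, "http://www.example.com/a?x=1")
print(referrers)
```

For a real export you would pass `open("crawl.csv", newline="", encoding="utf-8")` instead of the in-memory sample; the file name here is a placeholder.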
Hope that helps! I will look at the issue on the back end to see if they are actually duplicate content.
Have a great day,
Nick
-
Thanks for looking into this. Actually I checked the whole site by doing a batch search on Copyscape and there were only minor duplicate content issues. I resolved those by editing the content parts in question (on February 24th 2012).
Since I am desperately searching for the reasons why this site was penalized (and it def is...), it would be great to know why your duplicate content checker finds errors. The only cause I can think of is multiple versions of one page on different URLs. I do have all http://mysitedotcom redirected to www.mysitedotcom, and the trailing slash/no trailing slash URL problem was also resolved by a redirect long ago, so I do not know where the problem lies.
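One way to double-check for "same page, different URL" variants is to normalize a list of crawled URLs and see which ones collapse into the same key. A minimal sketch, assuming that www/non-www, trailing slash, and query-string differences are the variants of interest; the `normalize` and `group_variants` helpers and the example URLs are hypothetical. Note that dropping the query string is a simplification, since some query parameters do change the page content.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """Collapse common URL variants into one comparison key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Treat "/page" and "/page/" as the same path; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"
    # Assumption: scheme, query string, and fragment do not change the content.
    return urlunsplit(("http", host, path, "", ""))

def group_variants(urls):
    """Return only the keys that more than one crawled URL maps to."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}

urls = [
    "http://www.example.com/lessons/",
    "http://example.com/lessons",
    "http://www.example.com/lessons?sort=new",
    "http://www.example.com/about",
]
dupes = group_variants(urls)
print(dupes)
```

Any group this reports is only a candidate: if every variant already redirects to a single URL, there is nothing left to fix for that group.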
Thanks for the help!
-
I think our system has roughly a 90-95% threshold for duplicate content. The pages I'm seeing in your campaign don't look that high, so something is up - I'm checking with support.
For now, use the "Duplicate Page Title" section - that'll tend to give you exact duplicates. The duplicate content detection also covers thin content and near duplicates.
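To make the idea of a similarity threshold concrete, here is a small sketch. It is not how the Moz crawler works internally (that is not documented in this thread); it simply uses Python's `difflib` to score two texts word by word and applies an assumed cutoff of 0.90. The function names and sample texts are illustrative.

```python
from difflib import SequenceMatcher

def similarity(text_a, text_b):
    """Word-level similarity between 0.0 (nothing shared) and 1.0 (identical)."""
    return SequenceMatcher(None, text_a.split(), text_b.split()).ratio()

def is_near_duplicate(text_a, text_b, threshold=0.90):
    # Assumption: 0.90 stands in for the "roughly 90-95%" figure mentioned above.
    return similarity(text_a, text_b) >= threshold

page_a = "Learn poker strategy with our coaches and improve your game today"
page_b = "Learn poker strategy with our coaches and improve your game now"
page_c = "Contact us by email or phone for booking questions"

print(is_near_duplicate(page_a, page_b))
print(is_near_duplicate(page_a, page_c))
```

With a cutoff this high, two pages only match when almost all of their text is the same, which is why pages that merely look alike should normally not be flagged.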
-
Yes that is what I first thought too. If only it were that easy.
But when I do, I see a couple of URLs that definitely do not have any duplicate content. Could it be that the dupe content check considers text in sitewide modules (like the modules "Poker News" and "Tips for ...." in www.poker-coaching.net) as duplicate content, because they appear on all pages?
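Whether the checker really counts sitewide modules is not confirmed anywhere in this thread, but the effect itself is easy to demonstrate. The sketch below compares two pages with completely different main text, once with and once without a shared sidebar, using Jaccard similarity over three-word shingles. The texts, the shingle size, and the helper names are all assumptions chosen for illustration.

```python
def shingles(text, size=3):
    """Set of overlapping word sequences of the given length."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(text_a, text_b):
    """Shared shingles divided by all shingles seen in either text."""
    a, b = shingles(text_a), shingles(text_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical sitewide module text that appears on every page.
sidebar = (
    "poker news latest tournament results from around the world "
    "tips for beginners who want to move up in stakes quickly"
)
main_a = "how to play pocket aces before the flop"
main_b = "bankroll management rules for small stakes players"

without_sidebar = jaccard(main_a, main_b)
with_sidebar = jaccard(main_a + " " + sidebar, main_b + " " + sidebar)
print(without_sidebar, with_sidebar)
```

The shorter the unique text on a page is relative to the shared modules, the higher the second score climbs, so thin pages with big sidebars are the ones most likely to be affected.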
This way, the duplicate content finding function is totally worthless.
-
If you drill down into your campaign report into 'Crawl Diagnostics' you will see a dropdown menu that's named "Show". Select 'Duplicate Page Content'... you will see a graph with a table below it. To the right of the URL you will see a column named "Other URL's". The numbers in that column are live links to a page with the list of URLs with duplicate content. At least that is how it is displayed in my campaigns.
-
You will find this information in Google Webmaster Tools and in your SEOmoz campaign. There you will find the information you need.
One easy way to avoid this is to include the rel=canonical tag. You need to include the following inside the head tag of every page, pointing to the page you want to be the official one:
<link rel="canonical" href="http://www.example.com/index.html" />
where www.example.com/index.html is your page address. Good luck!
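To verify that a page actually carries the canonical tag you expect, you can parse its HTML and read the tag back. A minimal sketch using only Python's standard library; the `CanonicalFinder` class, the `find_canonical` helper, and the sample markup are illustrative, and fetching the live page is left out on purpose.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the markup, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

page = (
    "<html><head>"
    '<link rel="canonical" href="http://www.example.com/index.html" />'
    "</head><body>Hello</body></html>"
)
print(find_canonical(page))
```

Running this over every URL that is reported as a duplicate shows quickly whether all variants point to the same official page or whether some are missing the tag.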