Duplicate page report
-
We ran a CSV export of our crawl diagnostics for duplicate URLs after waiting 5 days with no response on how Rogerbot can be made to filter.
My IT lead tells me he thinks the label on the spreadsheet is showing “duplicate URLs”, and that is – literally – what the spreadsheet is showing.
The report thinks that a database ID number is the only valid part of a URL. To replicate: just filter the spreadsheet for any number that you see on the page. For example, filtering for 1793 gives us the following result:
| URL |
| --- |
| http://truthbook.com/faq/dsp_viewFAQ.cfm?faqID=1793 |
| http://truthbook.com/index.cfm?linkID=1793 |
| http://truthbook.com/index.cfm?linkID=1793&pf=true |
| http://www.truthbook.com/blogs/dsp_viewBlogEntry.cfm?blogentryID=1793 |
| http://www.truthbook.com/index.cfm?linkID=1793 |
There are a couple of problems with the above:
1. It gives the www result, as well as the non-www result.
2. It is seeing the print version as a duplicate (&pf=true) but these are blocked from Google via the noindex header tag.
3. It thinks that different sections of the website with the same ID number are the same thing (faq / blogs / pages).
In short: this particular report tells us nothing at all.
I am trying to get a perspective from someone at SEOMoz to determine whether he is reading the result correctly or whether there is something he is missing.
Please help. Jim
-
Hi Jim!
Thanks for the question. One thing we should clarify before we move forward is that the Pro app doesn't actually report on duplicate URLs, but we do report when we find duplicate title tags or content.
Duplicate titles just refer to when we find the same title tag on more than one page. In one example from your diagnostics, we're reporting that the title tag 'Truthbook Religious News' is being used on multiple pages (http://screencast.com/t/GYCKNfAoj).
Duplicate content is content we see in the source code of your pages that is identical or nearly identical and would cause the pages to compete against each other for rankings. To fix either of these you have several options:
- Set up a 301 redirect to have the pages you would consider duplicate redirect to the main page.
- Change the content/title tags enough that they won't be considered duplicates.
- Canonicalize the content you would consider duplicates.
Most developers will go for the latter two options so that the pages will still be reachable by visitors. You can find out more about how to implement these in our Help Hub.
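As a rough illustration of the canonicalization option, here is a minimal Python sketch that maps the URL variants from the question onto one preferred URL and builds the matching `rel="canonical"` tag. It assumes the root domain (`truthbook.com`) is the preferred host and that `pf=true` only marks the print view. The function names and the `DROP_PARAMS` set are hypothetical, and this is not how the Pro app or your site does it.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

PREFERRED_HOST = "truthbook.com"  # assumption: root domain is the preferred host
DROP_PARAMS = {"pf"}              # assumption: pf=true only switches to the print view


def canonical_url(url):
    """Return the URL with the preferred host and without print-view parameters."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DROP_PARAMS
    ]
    return urlunsplit((parts.scheme, PREFERRED_HOST, parts.path, urlencode(query), ""))


def canonical_link_tag(url):
    """Build the <link> element to place in the <head> of a duplicate page."""
    href = html.escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s">' % href


urls = [
    "http://truthbook.com/faq/dsp_viewFAQ.cfm?faqID=1793",
    "http://truthbook.com/index.cfm?linkID=1793",
    "http://truthbook.com/index.cfm?linkID=1793&pf=true",
    "http://www.truthbook.com/blogs/dsp_viewBlogEntry.cfm?blogentryID=1793",
    "http://www.truthbook.com/index.cfm?linkID=1793",
]
distinct = {canonical_url(u) for u in urls}
```

Under these assumptions the three `index.cfm?linkID=1793` variants collapse into one canonical URL, while the FAQ and blog pages keep their own, because only the host and the `pf` parameter are normalized and the path is left alone.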
To answer your other questions:
1 - At the time of the crawl, we were able to get to subdomain pages from other pages on your site. The subdomains were also resolving separately, but they seem to be redirecting to your root domain now, so your next crawl should reflect this.
2 - Running a curl for the print versions of your pages, I see "nofollow" tags related to Wikipedia links embedded (http://screencast.com/t/reYjeLLPvWG3) in the doc, but I'm not finding any "noindex" tags (http://screencast.com/t/DsXMZInngSzH). This would be why you're seeing us crawling those pages.
3 - As I mentioned above, our crawler looks for similarities in the source code of pages when reporting on duplicate content. Since no one knows exactly how similar content would need to be for the search engines to consider it a duplicate, we err on the side of caution and recommended best practices when reporting them. Using one of the methods mentioned above and detailed in our Help Hub should resolve this for you.
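For point 2, the difference between a "nofollow" on a link and a "noindex" robots meta tag can be checked with a small script. The sketch below uses Python's standard `html.parser` to look for `<meta name="robots" content="...noindex...">` in a page's source. The class and function names are hypothetical, the sample HTML is made up for illustration, and the sketch does not look at HTTP headers such as `X-Robots-Tag`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def has_noindex(html_text):
    """Return True if the page source carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


# Made-up samples: one page with only a nofollow link, one with a noindex meta tag.
page_with_nofollow_link = (
    '<html><head><title>Print view</title></head>'
    '<body><a rel="nofollow" href="http://en.wikipedia.org/">Wikipedia</a></body></html>'
)
page_with_noindex = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    '<body>Print view</body></html>'
)
```

A `rel="nofollow"` attribute on a link only concerns that link, so `has_noindex` returns `False` for the first sample and `True` for the second.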
Let me know if you have any other questions!
Best,
Sam
Moz Helpster