Duplicate Content Report: Duplicate URLs being crawled with "++" at the end
-
Hi,
In our Moz report over the past few weeks I've noticed some duplicate URLs appearing like the following:
Original (valid) URL:
http://www.paperstone.co.uk/cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green
Duplicate URL:
http://www.paperstone.co.uk/cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green++
These aren't appearing in Webmaster Tools or in a Screaming Frog crawl of our site, so I'm wondering if this is a bug with the Moz crawler. I realise that it could be resolved using a canonical reference or a 301 redirect from the duplicate to the canonical URL, but I'd like to find out what's causing it and whether anyone else is experiencing the same problem.
Thanks,
George
-
So glad to help, George!
-
Hi Chiaryn,
Thanks - you've been really helpful! I had assumed that as the referrer wasn't in the Web UI (per WMT), it wasn't available anywhere. I'd also assumed it was a copywriting issue and not a product data issue.
Need to readdress my assumptions.
George
-
Hey George,
Thanks for writing in.
I looked into the pages with the ++ in the URL and it seems that they do actually exist on the site, so it isn't an issue with our crawler that is causing these in your crawl errors. For example, a link to the URL http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx?filter_colour=Green++ can be found in the source code of the page http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx here: http://screencast.com/t/HpHTlSs5gH8H
You can find the referral pages for the ++ pages on the site by downloading the Full Crawl Diagnostics CSV. In the first column, search for ++. When you find the correct row, look in the column labeled "referrer" (column AM). This gives you the URL of the page where our crawlers first found the URLs that include ++. You can then visit that page to find the links to those URLs.
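If the CSV is large, the same lookup can be scripted. This is only a rough sketch: the column headers "URL" and "Referrer" are assumptions, so check them against the header row of your own export, and the two sample rows below are made up for illustration (the first mirrors the Desktop-Essentials example above).

```python
import csv
import io

def find_referrers(csv_file, needle="++", url_col="URL", referrer_col="Referrer"):
    """Return (url, referrer) pairs for rows whose URL contains `needle`.

    url_col and referrer_col are assumed header names, not confirmed ones.
    """
    reader = csv.DictReader(csv_file)
    return [
        (row.get(url_col) or "", row.get(referrer_col) or "")
        for row in reader
        if needle in (row.get(url_col) or "")
    ]

# Small in-memory sample standing in for the downloaded CSV (illustrative rows).
sample = io.StringIO(
    "URL,Referrer\n"
    "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx?filter_colour=Green++,"
    "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx\n"
    "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx?filter_colour=Green,"
    "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx\n"
)

matches = find_referrers(sample)
for url, referrer in matches:
    print(f"{url}\n  first found on: {referrer}")
```

For a real export, replace `sample` with `open("your-export.csv", newline="", encoding="utf-8")`, where the file name is whatever you saved the download as.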
Since these URLs with the ++ resolve with a 200 HTTP status and have the same code and content as the pages without the ++, our crawler counts them as duplicate content. I'm not certain why Screaming Frog and GWT are not finding or reporting these pages; it may be that they parse the + signs in the URL differently than our crawler does.
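To illustrate how the + signs could be read differently, here is a small sketch using Python's standard urllib.parse. It shows one possible mechanism only, not how Screaming Frog, GWT, or our crawler actually behave: in a query string, a + is conventionally decoded as a space, so a tool that decodes and trims the value would see the two URLs as the same filter, while a tool that compares the raw strings would see two distinct URLs.

```python
from urllib.parse import urlsplit, parse_qs

plain = ("http://www.paperstone.co.uk/"
         "cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green")
plus = plain + "++"

def colour_value(url):
    """Decode the filter_colour query parameter (a '+' becomes a space)."""
    return parse_qs(urlsplit(url).query)["filter_colour"][0]

decoded_plain = colour_value(plain)
decoded_plus = colour_value(plus)

# Raw comparison: the URLs differ, so they can be crawled as two pages.
same_raw = plain == plus

# Decoded and trimmed comparison: the filter values match.
same_after_trim = decoded_plain == decoded_plus.strip()

print(repr(decoded_plain), repr(decoded_plus))
print("same raw URL:", same_raw)
print("same value after trimming:", same_after_trim)
```

If the site trims the filter value in a similar way on the server side, that would also be consistent with both URLs returning identical content.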
As Keri and bishop23 mentioned, this is most likely not a major issue if GWT isn't reporting the errors, but we prefer to report the issues because we would rather be safe than sorry.
I hope this helps. Please let me know if you have any other questions.
Chiaryn
-
I'm not seeing an answer that jumps out at me for this one. For the immediate future, don't sweat it if you're not seeing it in GWT. This is assigned to our help desk, and we'll have someone from there investigate more and get back to you, though it might be a few days because of the Thanksgiving holiday (if you don't get an answer today, it may be Monday before we have a chance to respond).
-
If they're not appearing in WMT, then you should ignore them, unless they're exact duplicate content, in which case delete them.