Finding the source of duplicate content URLs
-
We have a website that displays a number of products. Each product has variations (sizes), and unfortunately every size has its own URL (for now, anyway). Needless to say, this causes duplicate content issues. (And of course, we are looking to change the URLs for our site as soon as possible.)
However, even though these duplicate URLs exist, you should not be able to land on them by navigating through the site. In theory, the site should always display the link to the smallest size. It seems there is a flaw in our system somewhere, as these links are now showing up in our campaign here on SEOmoz.
My question: is there any way to find the crawl path that led to the URLs that shouldn't have been found, so we can locate the problem?
-
Using the Screaming Frog SEO Spider (the free version crawls up to 500 URLs; the paid version [99 GBP for a yearly license] crawls as many as you want), you can see all of the inlinks to a particular page. Run a crawl of the site, find those pages in Screaming Frog, and view their inlinks. Then visit each inlinking page and check its code for links to the page you're looking for. This will quickly show you where the links to the pages you're trying to hide are coming from.
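If you would rather script the check than use a crawler tool, the same idea can be sketched in a few lines of Python: crawl breadth-first from the homepage, remember which page each URL was first discovered on, and walk those "parent" pointers back once the unwanted URL turns up. This is only a rough sketch under some assumptions. The names `extract_links`, `find_crawl_path` and the `fetch_html` callback are made up for illustration, it only follows plain `<a href>` links on the same host, and it ignores robots.txt, redirects and JavaScript-generated links.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute, fragment-free URLs linked from one page."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.hrefs]


def find_crawl_path(start_url, target_url, fetch_html, max_pages=500):
    """Breadth-first crawl from start_url.

    fetch_html is a caller-supplied function (hypothetical name) that takes
    a URL and returns that page's HTML as a string.
    Returns the shortest click path to target_url as a list of URLs,
    or None if the target is not reached within max_pages fetches.
    """
    if start_url == target_url:
        return [start_url]
    host = urlparse(start_url).netloc
    parent = {start_url: None}  # URL -> page it was first discovered on
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch_html(url)
        fetched += 1
        for link in extract_links(url, html):
            # Skip URLs already seen and links that leave the site.
            if link in parent or urlparse(link).netloc != host:
                continue
            parent[link] = url
            if link == target_url:
                # Walk the parent pointers back to the start page.
                path = [link]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(link)
    return None
```

In practice `fetch_html` could be a small wrapper around `urllib.request.urlopen` that decodes the response body. The last-but-one URL in the returned path is the page whose template contains the stray link.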
Also, have you checked the sitemap? The CMS might be creating links to these pages there.
Good luck, and let me know if you need any more help with this.
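The sitemap check can also be scripted. Here is a minimal Python sketch, assuming the sitemap follows the standard sitemaps.org XML format and that you can describe the unwanted size URLs with a simple test. The function names and the `is_unwanted` callback are hypothetical, and the sitemap XML is passed in as a string, so downloading it is left to you.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return every <loc> URL listed in a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def unwanted_sitemap_urls(xml_text, is_unwanted):
    """Return the sitemap URLs for which is_unwanted(url) is true.

    is_unwanted is a caller-supplied test (hypothetical), for example a
    check for whatever marks a non-default size in your URL scheme.
    """
    return [url for url in sitemap_urls(xml_text) if is_unwanted(url)]
```

If this returns any of the size-specific URLs, the sitemap is at least one of the places the crawler is picking them up.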
Mark