Crawl Errors and Duplicate Content
-
SEOmoz's crawl tool is telling me that I have duplicate content at "www.mydomain.com/pricing" and at "www.mydomain.com/pricing.aspx". Do you think this is just a glitch in the crawl tool (because obviously these two URLs are the same page rather than two separate ones) or do you think this is actually an error I need to worry about? If so, how do I fix it?
-
There are two aspects to the issue:
1. Resolve the cause of the problem. Crawl your site, locate any links to alternate URLs and change the links on your site to use the correct version of the URL.
2. Add a 301 redirect from /bad-url to /good-url. This ensures that any link juice pointing to the bad URLs is retained and gives visitors a good experience.
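The redirect in step 2 can be sketched as a small path-mapping rule. This is only an illustration in Python, not how an ASP.NET site would actually configure it; the function name `redirect_target` and the rule of stripping a trailing `.aspx` are assumptions based on the URLs in the question, and query strings are not handled.

```python
from typing import Optional, Tuple

# Hypothetical helper: decide whether a requested path should be
# permanently redirected to its extensionless form.
def redirect_target(path: str, suffix: str = ".aspx") -> Optional[Tuple[int, str]]:
    """Return (301, new_path) for paths ending in `suffix`, else None."""
    if path.lower().endswith(suffix) and len(path) > len(suffix):
        return 301, path[: -len(suffix)]
    return None

print(redirect_target("/pricing.aspx"))  # (301, '/pricing')
print(redirect_target("/pricing"))       # None
```

If the server internally rewrites /pricing to /pricing.aspx, a rule like this should only look at the URL the visitor originally requested, otherwise it could redirect in a loop.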
-
You would add the canonical tag to your existing page.
You need to decide how you wish your page to be listed.
Those are two different URLs. They COULD lead to two different pages, but you are choosing to have them lead to the same page, which is a very standard practice. You need to let search engines know how you want the page to be listed. The URL without the .aspx extension is the friendlier URL. I would suggest choosing that one but it is up to you.
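For illustration, here is roughly what the canonical tag looks like, built with a short Python sketch. The `http://` scheme, the constant `PREFERRED_URL` and the helper name `canonical_link` are assumptions; the domain is the placeholder from the question. Because a single page answers both URLs, the tag only has to be added once, inside that page's head, and it then appears whichever URL is requested.

```python
from html import escape

# Assumed preferred URL: the extensionless form suggested above.
PREFERRED_URL = "http://www.mydomain.com/pricing"

def canonical_link(url: str) -> str:
    """Build the <link> element that goes inside the page's <head>."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'

print(canonical_link(PREFERRED_URL))
# <link rel="canonical" href="http://www.mydomain.com/pricing" />
```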
-
Which page should I add the canonical tag to? From my research, a canonical tag functions in a similar way to a 301 redirect. However, I only have one content page even though it has two URLs. Do I need to literally create two different versions of this content and put the canonical tag on the unwanted page?
-
Hmm, I don't understand this. If a server can detect that these two URLs are the same why can't Google's billion dollar algorithm detect that these are the same?
-
This is not a glitch in the crawl tool. It is something that needs to be fixed.
As Cody suggested, search engines will not understand which URL is correct and any link credit can wind up being split.
Adding a canonical tag to your page will resolve the issue.
I would also examine the crawl report and look at the Referrer to determine if you have any links to the undesired page.
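As a rough sketch of that referrer check, assuming the crawl report is exported as CSV: the column names `URL` and `Referrer`, the sample rows and the helper name `referrers_to` below are all made up for the example, so adjust them to whatever your export actually contains.

```python
import csv
import io
from typing import Dict, List

# Made-up sample rows; a real export will have different columns and data.
SAMPLE_REPORT = """URL,Referrer
http://www.mydomain.com/pricing.aspx,http://www.mydomain.com/
http://www.mydomain.com/pricing,http://www.mydomain.com/features
http://www.mydomain.com/pricing.aspx,http://www.mydomain.com/contact
"""

def referrers_to(report_csv: str, unwanted_suffix: str = ".aspx") -> Dict[str, List[str]]:
    """Map each unwanted URL to the pages that link to it."""
    found: Dict[str, List[str]] = {}
    for row in csv.DictReader(io.StringIO(report_csv)):
        if row["URL"].lower().endswith(unwanted_suffix):
            found.setdefault(row["URL"], []).append(row["Referrer"])
    return found

for url, pages in referrers_to(SAMPLE_REPORT).items():
    print(url, "is linked from", pages)
```

Each referrer listed is a page whose link should be changed to point at the preferred URL.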
-
The thing is, in the eyes of a crawler they are different pages. It is the same as http://domain.com and http://www.domain.com: both serve the same page, but crawlers will treat them as different pages.
Are you using URL rewriting to get rid of the extension? If so I could see where this might cause a canonicalization issue if you don't tell the search engines which page you want to be ranked by using rel=canonical or redirecting from pricing.aspx to pricing.
Try using OSE (Open Site Explorer) on pricing and then on pricing.aspx and see if you get different statistics.
Related Questions
-
520 Error from crawl report with Cloudflare
I am getting a lot of 520 Server Errors in crawl reports. I see this is related to Cloudflare. We know 520 is Cloudflare, so maybe the Moz team can change this from "unknown" to "Cloudflare 520". Perhaps the Moz team can also update the "how to fix" section in the reporting, if they have suggestions on how to avoid seeing these in the report or on whether there is a real issue that needs to be addressed. At this point I don't know. There must be a solution that Moz can provide, like a setting in Cloudflare that will permit Rogerbot if Cloudflare is blocking it because it does not like its behavior. It could be that Rogerbot is crawling my site on a bad day or at a time when we were deploying a massive site change. If I know when my site will be down, can I pause Rogerbot? I found this https://developers.cloudflare.com/support/troubleshooting/general-troubleshooting/troubleshooting-crawl-errors/
Technical SEO | | awilliams_kingston0 -
Duplicate content or an update ???
Buying guide and product category page competing for the same keyword? We've got a “nuts and bolts” website selling basic stuff. Imagine selling simple nuts, bolts and washers (the little ring that goes in between) in different metals. Imagine a website with a very wide and deep line of these simple products. For long-tail keywords we rank well (example: 0.25 inch bolts). For the keyword “nuts bolts” our main category page used to rank well, from low on the 1st page to the second page, up against the big guys (Amazon, Walmart, Target, Costco, some drug store that may have a mixed pack of nuts and bolts, but Google still doesn’t see the difference and lists 2 pages each for these guys). But then in mid-February there was an update, and suddenly our “Buying guide for nuts and bolts” ranked higher and started to compete with our own product category page. That was never our intention. These two pages now compete for ranking on page 4. Clearly there were more words on the buying guide page, but no changes had been made to it for months or years. To make up for it, some more words were added to the category page, but of course there are only so many ways you can phrase words about “nuts and bolts” without sounding a bit duplicated or rewritten. So what do I do now? Clearly the product category page is the one we would like to rank highest, with the guide a close 2nd. Most customers don’t need the buying guide, but it is good to have and a great support, as we got a lot of good comments from customers who read it. I made a link to the buying guide from the category page and vice versa. The category page has an embedded video. Moz lists the page authority as 16 for the category page and 1 for the buying guide, but clearly Google sees it differently. I already tried changing the meta title and description a little, but that is hard to do if the words “nuts bolts” are to appear in the description; otherwise people don’t know what to expect.
I could just insert a “do not index” for the buying guide, but that is not a good long-term solution. Unfortunately I am out of imagination at this point. Any good suggestions? Thanks, Kim
Technical SEO | | KimX0 -
When Should I Ignore the Error Crawl Report
I have a handful of pages listed in the Error Crawl Report, but the report isn't actually showing anything wrong with these pages. I am double checking the code on the site and also can't find anything. Should I just move on and ignore the Error Crawl Report for these few pages?
Technical SEO | | ChristinaRadisic0 -
Link Structure & Duplicate Content
I am struggling with how I should handle the link structure on my site. Right now most of my pages are like this:
Home -> Department -> Service Groups -> Content Page
For example:
Home -> IT Solutions -> IT Support & Managed Services -> IT Support
Home -> IT Solutions -> IT Support & Managed Services -> Managed Services
Home -> IT Solutions -> IT Support & Managed Services -> Help Desk Services
Home -> IT Solutions -> Virtualization & Data Center Solutions -> Virtualization
Home -> IT Solutions -> Virtualization & Data Center Solutions -> Data Center Solutions
This structure lines up with our business and makes logical sense but I am not sure how to handle the department and service group pages. Right now you can click them and it just brings you to a page with a small snippet for the links below. The real content is on the content pages. What I am worried about is that the snippets on those pages are just a paragraph or two of the content that's on the content page. Will this hurt me and get considered duplicate content? What is the best practice for dealing with this? Those department/service group pages have some good content on them but it's just parts of other pages. Am I okay doing this because there are not direct duplicates of other pages just parts of a few pages? Any help on this would be great. Thanks in advance.
Technical SEO | | ZiaTG0 -
Duplicate Page Titles and Content
I have a site that has a lot of contact modules. Basically, each section/page has a contact person, and when you click the contact button it brings up a new window with a form to submit and then ends with a thank-you page. All of the contact and thank-you pages are showing up as duplicate page titles and content. Is this something that needs to be fixed even if I am not using them to target keywords?
Technical SEO | | AlightAnalytics0 -
Tracking a Crawl error
Hi all, if you find a crawl error on your site, how do you track down where it comes from? The report only gives the URL that is wrong, not the location of the link to it. Can I drill down and find out more information? Thank you!
Technical SEO | | wedmonds0 -
Duplicate content
Greetings! I have inherited a problem that I am not sure how to fix. The website I am working on had a 302 redirect from its original home url (with all the link juice) to a newly designed page (with no real link juice). When the 302 redirect was removed, a duplicate content problem remained, since the new page had already been indexed by google. What is the best way to handle duplicate content? Thanks!
Technical SEO | | shedontdiet0 -
Duplicate content issues caused by our CMS
Hello fellow mozzers, Our in-house CMS - which is usually good for SEO purposes as it allows all the control over directories, filenames, browser titles etc. that prevents unwieldy / meaningless URLs and generic title tags - seems to have got itself into a bit of a tizz when it comes to one of our clients. We have tried solving the problem to no avail, so I thought I'd throw it open and see if anyone has a solution, or whether it's just a fault in our CMS. Basically, the SEs are indexing two identical pages, one ending with a / and the other ending /index.php, for one of our sites (www.signature-care-homes.co.uk). We have gone through the site and made sure the links all point to just one of these, and have done the same for off-site links, but there is still the duplicate content issue of both versions getting indexed. We also set up an htaccess file to redirect to the chosen version, but to no avail, and we're not sure canonical will work for this issue as / pages should redirect to /index.php anyway - and that's what we can't work out. We have set the htaccess file to point to index.php, and that is what should be happening anyway, but it isn't. Is there an alternative way of telling the SEs to only look at one of these two versions? Also, we are currently rewriting the content and changing the structure - will this change the situation we find ourselves in?
Technical SEO | | themegroup0