Database-driven content producing false duplicate content errors
-
How do I stop the Moz crawler from creating false duplicate content errors? I have not yet submitted my website to Google's crawler because I want to fix all my site optimization issues first.
Example: contactus.aspx?propid=200, contactus.aspx?propid=201, and so on. These are the same page, just with some old URL parameters stuck on them. How do I get Moz and Google not to consider these duplicates? I have looked at http://moz.com/learn/seo/duplicate-content with respect to
Rel="canonical"
and I think I am just confused.
Nick
-
All of you guys rock! I have never been involved in a community that has had the right answers every time. I used the rel="canonical" tag on all my static pages such as directions, policies, contact, etc., and it removed all the parameters, so those URLs no longer stand out in the Moz crawl. I feel like an idiot for not knowing about this HTML tag and its importance. My Moz crawl now looks so much better.
By "old URL parameters" I just meant a few seconds old: the user is on property.aspx?property=1, and when they move to a static page such as contact, directions, or policy, we end up with another URL such as contact.aspx?property=1. With 150 properties and 10 static pages, I had created 150 duplicate content errors for the contact page alone, because contact.aspx?property=1 through contact.aspx?property=150 are all the same page. I am sure this has killed my SEO. SO THAT PROBLEM IS NOW FIXED!!
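To make the effect of the canonical tag concrete, here is a minimal sketch of the idea. It is not the site's actual code: the STATIC_PAGES list, the helper names, and the example.com host are all assumptions for illustration. Static pages drop their query string when declaring a canonical URL, while pages whose parameter selects real content keep it.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of pages whose content does not depend on the query string.
STATIC_PAGES = {"/contact.aspx", "/directions.aspx", "/policy.aspx"}


def canonical_url(url: str) -> str:
    """Return the URL a page should declare as its canonical address."""
    parts = urlsplit(url)
    if parts.path.lower() in STATIC_PAGES:
        # Static page: the parameters are leftovers, so drop query and fragment.
        return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Dynamic page (e.g. property.aspx?property=1): the parameter picks the
    # content, so it stays part of the canonical URL.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))


def canonical_link_tag(url: str) -> str:
    """Build the tag that would go in the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}" />'


# 150 crawled variants of the contact page collapse to one canonical URL.
variants = [f"http://example.com/contact.aspx?property={n}" for n in range(1, 151)]
unique = {canonical_url(u) for u in variants}
print(len(variants), "crawled URLs ->", len(unique), "canonical URL")
print(canonical_link_tag(variants[0]))
```

The crawler still reaches every variant through the site's links, but each one now points at the same canonical address.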
NOW, to revisit what zenstorageunits says about URL rewriting: there are many different ways to do it in .NET, but Miketek, I would not have to create subdirectories because it is done in code. They are more like virtual directories.
zenstorageunits, or anyone else for that matter: is it worth it for me to hire somebody to create a URL rewrite app that can change the following:
http://www.destinationbigbear.com/property_detail.aspx?propid=202 to
http://www.destinationbigbear.com/big-bear-cabin-rentals/a-true-cabin/details
and
http://www.destinationbigbear.com/property_photos.aspx?propid=202 to
http://www.destinationbigbear.com/big-bear-cabin-rentals/a-true-cabin/photos
Every one of my 150 cabins has these pages: info, photos, calendar, video, reviews, and rates, and every cabin has a unique name. So it is basically 150 cabins x 6 pages = 900 unique pages with unique content, but really only 6 page templates filled in dynamically for 150 cabins.
I have been able to dynamically change the page titles for every one of these 900 database-driven pages, such as
Big-Bear-Cabin | A True Cabin Photos or Big-Bear-Cabin | A True Cabin Calendar and so on.
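As a rough illustration of what such a rewrite layer has to do, here is a language-neutral sketch in Python rather than .NET. The CABINS lookup, the SECTION_PAGES table, and the function names are assumptions. Only the two page names and the URL pattern quoted above come from this thread, and the cabin names would come from the database in practice.

```python
import re

# Friendly last path segment -> existing .aspx page (the two from the thread).
SECTION_PAGES = {
    "details": "property_detail.aspx",
    "photos": "property_photos.aspx",
}

# Hypothetical stand-in for the database: propid -> cabin name.
CABINS = {202: "A True Cabin"}


def slugify(name: str) -> str:
    """Lowercase the name and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


# Reverse lookup used when a friendly URL comes in.
SLUG_TO_ID = {slugify(name): propid for propid, name in CABINS.items()}


def friendly_url(propid: int, section: str) -> str:
    """Build the public URL to use in links."""
    return f"/big-bear-cabin-rentals/{slugify(CABINS[propid])}/{section}"


def internal_url(path: str):
    """Map a friendly path back to the real page, or None if it does not match."""
    match = re.fullmatch(r"/big-bear-cabin-rentals/([a-z0-9-]+)/([a-z]+)", path)
    if not match:
        return None
    slug, section = match.groups()
    if slug not in SLUG_TO_ID or section not in SECTION_PAGES:
        return None
    return f"/{SECTION_PAGES[section]}?propid={SLUG_TO_ID[slug]}"


def page_title(propid: int, section: str) -> str:
    """Title in the pattern described above."""
    return f"Big-Bear-Cabin | {CABINS[propid]} {section.capitalize()}"


print(friendly_url(202, "photos"))
print(internal_url("/big-bear-cabin-rentals/a-true-cabin/photos"))
print(page_title(202, "photos"))
```

One design point worth noting: the slug has to be unique per cabin, otherwise two cabins with similar names would map to the same friendly URL.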
-
Hi Nick,
I think you've gotten some good tips here - I'd agree with Prestashop that the preferred solution would be to find where these parameters are being included in links to this page and remove them.
Failing that, zenstorageunits's advice to use rel="canonical" would be my recommendation - or a 301 redirect from the URLs that include parameters back to the core URL would work.
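The 301 option can be sketched as a small decision function. This is an illustration only: the STRAY_PARAMS and STATIC_PAGES sets and the function name are assumptions, and the real check would live in whatever handles requests on the server. It returns the URL to redirect to when a static page is requested with a stray parameter, and None when no redirect is needed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical configuration: parameters that should never appear on static pages.
STRAY_PARAMS = {"propid", "property"}
STATIC_PAGES = {"/contactus.aspx"}


def redirect_target(url: str):
    """Return the URL to 301-redirect to, or None if the request is already clean."""
    parts = urlsplit(url)
    if parts.path.lower() not in STATIC_PAGES:
        return None
    params = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in params if key.lower() not in STRAY_PARAMS]
    if len(kept) == len(params):
        # Nothing stray was present, so redirecting would loop.
        return None
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(redirect_target("http://example.com/contactus.aspx?propid=200"))
print(redirect_target("http://example.com/contactus.aspx"))
```

Unlike a canonical tag, which is a hint, the redirect means the parameterized URL never serves content at all.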
I wouldn't convert these parameters to subdirectories unless they are integral to the way your site works and pull up unique content. You called them "old parameters," which suggests they're not supposed to be there, so this is probably not a case where converting them to subdirectories makes sense.
Failing the above, you could utilize the Google Webmaster Tools "URL Parameters" interface to tell Googlebot to ignore these parameters.
Overall, your best course of action is to find and remove the links that include the parameters.
I'd also add that the Moz crawl report is highly sensitive to "duplicate content," and I often find it flags issues as high/medium priority that are not actually going to have a significant impact on the site. You have to take the crawl report with a grain of salt: while duplicate content can be a serious issue for some sites (ecommerce retailers, for example, with duplication across a wide catalog of products), in most cases it has minimal impact and isn't something I'd hold up your site launch for.
Best of Luck,
Mike
-
I agree with zenstorageunits about using rel=canonical, but one thing I would like to point out is that Moz does not create false errors. It is a simple crawler, not like Google. Google will actually try to follow links that people have used before and that show up in your analytics files. Moz uses no logic like that; it just jumps from page to page. If it is picking up a page with a query string like that, then there is a link to it on your site. I would find those links and take them off.
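One way to hunt for those links is to scan each page's HTML for anchors whose href carries a query string. The following is a minimal sketch using Python's standard library. The class and function names and the sample markup are made up for illustration, and a real audit would run this over every page the crawler reports as a referrer.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class QueryLinkFinder(HTMLParser):
    """Collect the href of every <a> tag that includes a query string."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and urlsplit(href).query:
            self.found.append(href)


def links_with_query(html: str):
    """Return all link targets in the markup that carry URL parameters."""
    finder = QueryLinkFinder()
    finder.feed(html)
    return finder.found


sample = """
<a href="contactus.aspx?propid=200">Contact</a>
<a href="directions.aspx">Directions</a>
"""
print(links_with_query(sample))
```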
-
You have a few options. One thing I would look into is URL rewriting, to change
contactus.aspx?propid=200
to
contactus/propid/200
See http://msdn.microsoft.com/en-us/library/ms972974.aspx for how to do that on IIS.
A better option, I think, if you need to keep the parameters the way they are, is to use the rel canonical tag. See the Moz article
http://moz.com/blog/rel-confused-answers-to-your-rel-canonical-questions
but basically you would need to add something like this to your contact.aspx page (replace example.com with your website URL):
<link rel="canonical" href="http://example.com/contact.aspx" />
This suggests to a website crawler, such as Google's or the Moz crawler, that those pages should be associated with the contact.aspx page.