Duplicate content warning: Same page but different urls???
-
Hi guys i have a friend of mine who has a site i noticed once tested with moz that there are 80 duplicate content warnings, for instance
Page 1 is http://yourdigitalfile.com/signing-documents.html
the warning page is http://www.yourdigitalfile.com/signing-documents.html
another example
Page 1 http://www.yourdigitalfile.com/
same second page http://yourdigitalfile.com
i noticed that the whole website is like the nealry every page has another version in a different url?, any ideas why they dev would do this, also the pages that have received the warnings are not redirected to the newer pages you can go to either one???
thanks very much
-
Thanks Tim. Do you have any examples of what those problems might be? With such a large catalog managing those rel canonical tags will be difficult (I don't even know if the store allows them, it's a hosted store solution and little code customization is allowed).
-
Hi there AspenFasteners, in this instance rather than a .HTAccess rule I would suggest applying a rel canonical tag which points to the page you deem as the original master source.
Using the robots to try and hide things could potentially cause you more issues as your categories may struggle to be indexed correctly.
-
We have a similar problem, but much more complex to handle as we have a massive catalog of 80,000 products and growing.
The problem occurs legitimately because our catalog is so large that we offer different navigation paths to the same content.
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8314.htm
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8315.htm
(If you look at the "You are here" breadcrumb trail, you will see the subtle differences in the navigation paths, with 8314.htm, the user went through Home > Screws, with 8315.htm, via Home > Security Fasteners > Screws).
Our hosted web store does not offer us htaccess, so I am thinking of excluding the redundant navigation points via robots.txt.
My question: is there any reason NOT to do this?
-
Oh ok
The only reason i was thinking it is duplicate content is the warnings i got on the moz crawl, see below.
75 Duplicate Page Content
6 4xx Client Error
5 Duplicate Page Title
44 Missing Meta Description Tag
5 Title Element is Too Short
I have found over 80 typos, grammatical errors, punctuation errors and incorrect information which was leading me to believe the quality of the work and their attention to detail was rather bad, which is why i thought this was a possibility.
Thanks again for your time its really appreciated
-
I wouldn't say that they have created two pages, it is just that because you have two versions of the domain and not set a preferred version that you are getting it indexing twice. .HTaccess changes are under the hood of the website and could have simply been an oversight.
-
Hey Tim
Thanks for your answer. It's really weird, other than lazyness on the devs part not to remove old or previous versions of pages?, have you any idea why they would create multiple versions of the same page with different url's?? is there any legit reason like ones severs mobile or something??
Just wondering thanks for replying
-
OK, so in this instance the only issue you have is that you need to choose your preferred start point - www or non www.
I would add a bit of code to your htaccess file to point to your preferred choice. I personally prefer a www. domain. Something like the below would work.
RewriteCond %{HTTP_HOST} ^example.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]As your site is already indexed I would also for the time being and as more of a safety measure add canonicals to the pages that point to the www. version of your site.
Also if you have a Google Search Console account, you can select your prefered domain prefix in there. this will again help with your indexation.
Hopefully I have covered most things.
Cheers
Tim
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Too many links pointing to our privacy policy page: Hurting our ranking efforts of main pages?
Hi community, As per the "Links" report from GSC, there are millions of pages pointing to our privacy policy page. We can expect high number of links to this page being ours an open source product. But these links are overtaking the count of links pointing to our homepage which are very artificial from few spammy or low quality sites. "Privacy policy" anchor text is also been the top anchor text. Our homepage ranking dropped and I suspect this is the culprit. Google might be considering this is the important page being linked on top with anchor text. Shall I Disavow these sites and will this makes Google stop counting links, and the anchor text coming from these sites as well? Suggestions please. Thanks
White Hat / Black Hat SEO | | vtmoz0 -
Page plumetting with a optimisation score of 97\. HELP
Hi everyone, One of my pages has an optimisation score of 93, but ranks in 50+ place. What on earth can I do to address this? It's a course page so I've added the 'course' schema. I've added all the alt tags to say the keyword, UX signals aren't bad. Keyword is in the title tag. It has a meta description. Added an extra 7 internal, anchor-rich links pointing at the page this week. Nothing seems to address it. Any ideas? Cheers, Rhys
White Hat / Black Hat SEO | | SwanseaMedicine1 -
What can I put on a 404 page?
When it comes to SEO what can I put on a 404 page? I want to add content that actually makes the page useful so visitors will more likely stay on the website. Most pages just have a big image of 404 and a couple sentences saying what happened. I am wondering if Google would like if there was blog suggestions or navigational functions?
White Hat / Black Hat SEO | | JoeyGedgaud0 -
Noindexed Pages with External Links Pointing to it: Does the juice still pass through?
I have a site with many many pages that have very thin content, yet they are useful for users/visitors. Those pages also have many external links pointing to them from reputable and authoritative websites. If i were to noindex/follow these pages, will the juice/value from the external links still pass through just as if the page didn't have the noindex tag? Please let me know!
White Hat / Black Hat SEO | | juicyresults0 -
Does Trade Mark in URL matter to Google
Hello community! We are planning to clean up TM and R in the URLs on the website. Google has indexed these pages but some TM pages are have " " " instead displaying in URL from SERP. What's your thoughts on a "spring cleaning" effort to remove all TM and R and other unsafe characters in URLs? Will this impact indexed pages and ranking etc? Thank you! b.dig
White Hat / Black Hat SEO | | b.digi0 -
How does Google decide what content is "similar" or "duplicate"?
Hello all, I have a massive duplicate content issue at the moment with a load of old employer detail pages on my site. We have 18,000 pages that look like this: http://www.eteach.com/Employer.aspx?EmpNo=26626 http://www.eteach.com/Employer.aspx?EmpNo=36986 and Google is classing all of these pages as similar content which may result in a bunch of these pages being de-indexed. Now although they all look rubbish, some of them are ranking on search engines, and looking at the traffic on a couple of these, it's clear that people who find these pages are wanting to find out more information on the school (because everyone seems to click on the local information tab on the page). So I don't want to just get rid of all these pages, I want to add content to them. But my question is... If I were to make up say 5 templates of generic content with different fields being replaced with the schools name, location, headteachers name so that they vary with other pages, will this be enough for Google to realise that they are not similar pages and will no longer class them as duplicate pages? e.g. [School name] is a busy and dynamic school led by [headteachers name] who achieve excellence every year from ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, we encourage all of our pupils to “Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards. Something like that... Anyone know if Google would slap me if I did that across 18,000 pages (with 4 other templates to choose from)?
White Hat / Black Hat SEO | | Eteach_Marketing0 -
Website Vulnerability Leading to Doorway Page Spam. Need Help.
Keywords he is ranking for , houston dwi lawyer, houston dwi attorney and etc.. Client was acquired in June and since then we have done nothing but build high quality links to the website. None of our clients were dropped/dinged or impacted by the panda/penguin updates in 2012 or updates previously published via Google. Which proves we do quality SEO work. We went ahead and started duplicating links which worked for other legal clients and 5 months later this client is either dropping or staying in local maps results and we are performing very badly in organic results. Some more history..... When he first engaged our company we switched his website from a CMS called plone to word press. During our move I ran some searches to figure out which pages we needed to 301 and we came across many profile pages or member pages created on the clients CMS (PLONE). These pages were very spammy and linked to other plone sites using car model,make,year type keywords (ex:jeep cherokee dealerships). I went through these sites to see if they were linking back and could not find any back links to my clients website. Obviously nobody authorized these pages, they all looked very hackish and it seemed as though there was a vulnerability on his plone CMS installation which nobody caught. Fast forward 5 months and the newest OSE update is showing me a good 50+ back links with unrelated anchor text back links. These anchor text links are the same color as the background and can only be found if you hover your mouse over certain areas of the site. All of these sites are built on Plone and allot of them are linked to other businesses or community websites. These websites obviously have no clue they have been hacked or are being used for black hat purposes. There are dozens of unrelated anchor text links being used on external websites which are pointing back to our clients website. Examples: <a class="clickable title link-pivot" title="See top linking pages that use this anchor text">autex Isuzu, </a><a class="clickable title link-pivot" title="See top linking pages that use this anchor text">Toyota service department ratings, </a><a class="clickable title link-pivot" style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;" title="See top linking pages that use this anchor text">die cast BMW and etc..</a> Obviously the first step is to use the disavow link tool, which will be completed this week. The second step is to take some feedback from the SEO community. It seems like these pages are automatically created using some type of bot. It will be very tedious if we have to continually remove these links. I hope there is a way to notify Google that these websites are all plone and have a vulnerability, which black hats are using to harm the innocent... If i cannot get Google to handle this, then the only other option is to start fresh with a new domain name. What would you do in this situation. Your help is greatly appreciated. Thank you
White Hat / Black Hat SEO | | waqid0