Duplicate content and ways to deal with it.
-
Problem
I pulled a year of data for the portal, and it shows the SEO juice being split between the uppercase and lowercase versions of the same URL. You can see the issue in the attached images.
Solutions:
1) Quick: Change the links on the pages above to lowercase.
2) Use the canonical link tag: http://www.seomoz.org/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps
The tag is part of the HTML header on a web page, the same section you'd find the Title attribute and Meta Description tag. In fact, this tag isn't new, but like nofollow, simply uses a new rel parameter. For example:
<link rel="canonical" href="http://www.darden.virginia.edu/MBA" />
''This would tell Yahoo!, Live & Google that the page in question should be treated as though it were a copy of the URL http://www.darden.virginia.edu/MBA and that all of the link & content metrics the engines apply should technically flow back to that URL.''
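If the pages are generated from templates, the tag can be built rather than typed by hand. Here is a minimal Python sketch, assuming the lowercase form is the version you prefer (the thread does not settle which case should win). The function name `canonical_link_tag` is made up for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url: str) -> str:
    """Build a rel="canonical" link element for the lowercase form of url."""
    parts = urlsplit(url)
    # Assumption: only host and path are lowercased. Query strings can be
    # case-sensitive, so they are passed through untouched.
    canonical = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path.lower(), parts.query, "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'


print(canonical_link_tag("http://www.darden.virginia.edu/MBA"))
```

Both the /MBA and /mba pages would then carry the same tag, so the engines are pointed at a single URL.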
3) See if there are any Google Analytics filters I can apply at the site level. I will check into this and get back to you.
What do you all think?
-
Because that is just filtering your data in your report. That will not stop this from happening.
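To make that point concrete, here is a rough Python sketch of what a lowercase filter effectively does to report rows. The paths and pageview numbers are invented for illustration, and `merge_case_variants` is a hypothetical helper, not anything Google Analytics exposes.

```python
from collections import Counter


def merge_case_variants(rows):
    """Sum pageviews for paths that differ only by letter case."""
    totals = Counter()
    for path, pageviews in rows:
        # The report row is keyed by the lowercased path; the live URLs
        # on the site are not changed by this in any way.
        totals[path.lower()] += pageviews
    return dict(totals)


# Made-up report rows: (path, pageviews)
report = [("/MBA", 120), ("/mba", 80), ("/admissions", 40)]
print(merge_case_variants(report))
```

The merged report shows one row for /mba, but both /MBA and /mba still exist on the site and can still be crawled and linked to separately.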
-
I think (2) - the canonical tag - is a solid solution if just a few URLs are out of whack, but if you're using the mixed-case version internally, then you may need to change your structure as well. If you change your structure, then I'd probably look at a full-scale system of 301-redirects to preserve inbound link-juice.
It sounds like you're linking to mixed-case internally, so you may need to set up the redirects. Make sure that, depending on your platform, the case-specific redirects work properly (and don't create an endless loop). There is some risk to making the switch, so I'd probably only do it if you're seeing this happen a lot. Unfortunately, mixed-case URLs are often more trouble than they're worth.
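One way to avoid the endless loop is to redirect only when lowercasing actually changes the path. The following is a sketch of that check in Python, assuming lowercase is the target form. How it gets wired into a server or CMS depends on the platform, and `lowercase_redirect_target` is a hypothetical name.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301 to, or None if no redirect is needed."""
    parts = urlsplit(url)
    lowered = parts.path.lower()
    if lowered == parts.path:
        # Already lowercase: do not redirect, otherwise the request
        # would bounce to itself forever.
        return None
    # Keep the query string and fragment exactly as requested.
    return urlunsplit(
        (parts.scheme, parts.netloc, lowered, parts.query, parts.fragment)
    )


target = lowercase_redirect_target("http://www.darden.virginia.edu/MBA")
print(target)
print(lowercase_redirect_target(target))
```

Running the function on its own output returns None, which is the property that keeps the redirect from looping.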
-
Why would I not just do this?
http://support.google.com/googleanalytics/bin/answer.py?hl=en&answer=90397
-
I would stick to using the Rel=Canonical tag.
You could also check in Google Webmaster Tools and look at the URL parameter handling tool.
The general steps for dealing with duplicate content are:
1. Recognize duplicate content on your website.
2. Determine your preferred URLs.
3. Apply 301 permanent redirects where necessary and possible.
4. Implement the rel="canonical" link element on your pages where you can.
5. Use the URL parameter handling tool in Google Webmaster Tools where possible.
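For the first step, case-only duplicates like the ones in this thread can be found from a list of crawled URLs. This is a small Python sketch under the assumption that two URLs are duplicates when they match after lowercasing the whole string (that is too aggressive if your query strings are case-sensitive). `find_case_duplicates` is a made-up name.

```python
from collections import defaultdict


def find_case_duplicates(urls):
    """Group URLs that are identical apart from letter case."""
    groups = defaultdict(set)
    for url in urls:
        groups[url.lower()].add(url)
    # Keep only the groups that really have more than one variant.
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


crawled = [
    "http://www.darden.virginia.edu/MBA",
    "http://www.darden.virginia.edu/mba",
    "http://www.darden.virginia.edu/about",
]
print(find_case_duplicates(crawled))
```

Each group that comes back is a candidate for picking one preferred URL and pointing the other variants at it.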
Further reading: http://googlewebmastercentral.blogspot.co.uk/2009/10/reunifying-duplicate-content-on-your.html
I hope this helps
Ally
-
Option 2, using rel=canonical, seems like the best course of action to me. You may also want to apply a 301 redirect.