Duplicate content detected by Moz crawl despite canonical applied
-
Hi there!
I have a slight problem.
I have a site on Joomla 3.3 that we recently migrated from 2.5. For some reason that I don't really understand, Joomla creates hundreds of weird URLs for the site, like:
mydomain.com/en -> Joomla creates en/home/149-xxx-xxx/xxxxxx-xxxxxx, which links to the first one.
Version 3.3 knows about this bug and applies a rel=canonical to the URLs created "artificially", so they should not be identified as duplicates. Sample piece of code: en/home/149-all-en/xxxxxxx-xxxxxx" rel="canonical" /
The Moz crawler still identifies these as duplicates, so I have thousands of pages flagged as duplicated (titles, content, etc.), all of them created by Joomla. My site still has good SEO results and I cannot see any penalties, but I am a bit concerned they may come in the future.
Can anyone explain to me what is happening?
Thank you in advance for your time,
-
If it's a period of 2 weeks and you're going to do it anyways, I would just make the new content and not go to the expense of setting up redirects and then taking them down, which can cause issues when you plan on recreating a URL.
-
Thank you for your time!
We are going to set up 301 redirects (one colleague suggested importing them directly into the redirects DB) from those duplicated pages until Joomla has a native solution and we have the time to write all the unique content, to avoid penalties.
At least that would solve the problem temporarily; it will take 2 weeks to write all the unique content.
Would that make sense?
Have a nice weekend!
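For the temporary 301 approach, one possible way to prepare the rules in bulk is to generate them from a list of duplicate-to-original pairs. This is only a minimal Python sketch, assuming an Apache server where mod_alias `Redirect` directives are allowed in .htaccess. The function name and the example URLs are illustrative, and it does not touch Joomla's own redirect storage.

```python
from urllib.parse import urlsplit


def build_redirect_rules(pairs):
    """Return Apache mod_alias 'Redirect 301' lines for (duplicate, original) pairs.

    Hypothetical helper: assumes Apache with mod_alias and .htaccess overrides
    enabled, and that each duplicate is given as an absolute URL or a path.
    """
    rules = []
    for duplicate, original in pairs:
        # Redirect matches on the URL path, so drop scheme and host from the source.
        path = urlsplit(duplicate).path
        rules.append(f"Redirect 301 {path} {original}")
    return rules


# Illustrative pair only; replace with the duplicates reported by the crawl.
example_pairs = [
    ("http://mydomain.com/en/home/149-xxx-xxx/xxxxxx-xxxxxx", "http://mydomain.com/en"),
]

for line in build_redirect_rules(example_pairs):
    print(line)
```

The printed lines could be pasted into .htaccess and removed again once the unique content is live.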
-
I personally would not generate new language sections unless the content has been translated and localized on those pages. Right now your Spanish homepage has English content in the body, so I would view this as incomplete. Ideally you'd translate the entire page for those sections.
When you do that, you'll want to use hreflang, not canonicals, to indicate different versions of the same content.
So, my recommendation is (A) get rid of the Spanish content sections which would solve the duplication problem, or (B) finish translating the content and then install hreflang code, which would also solve the duplication problem.
Unfortunately I don't know of a good hreflang tool for Joomla specifically.
Let me know if that makes sense?
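To make option (B) a bit more concrete, here is a rough Python sketch of what the hreflang markup for one translated page could look like. The function name, the language codes, and the URLs are assumptions for illustration, not something taken from a Joomla extension.

```python
from html import escape


def hreflang_links(versions):
    """Build <link rel="alternate" hreflang=...> tags for one page's translations.

    `versions` maps a language code to the URL of that translation
    (hypothetical input, supplied by whoever maintains the site).
    """
    tags = []
    for lang, url in sorted(versions.items()):
        tags.append(
            f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        )
    return tags


# Illustrative URLs; each language version of the page should carry the
# full set of tags, including the one that points at itself.
page_versions = {
    "en": "http://mydomain.com/en/",
    "es": "http://mydomain.com/es/",
}
print("\n".join(hreflang_links(page_versions)))
```

The same block of tags would go in the head of both the /en/ and the /es/ version of the page.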
-
Thank you Kane.
I would like to keep the content in all the languages, as I think it is useful for customers to reach certain areas easily.
The problem I keep having is the implementation. There are no really good canonical plugins (ones that would allow me to do a bulk import), and I am not advanced enough to do a 301 redirect in htaccess. Still, I would like someone on the NL or FI version who wants to find the Barcelona area to be able to see it.
Anything in mind? Just to say, I tried SH404, which does all the work but rewrites the whole URL structure (not possible). I tried the canonical plugin http://www.cmsplugin.com/products/components/4-canonical-url, which solves the duplication by language but not the random URLs created by 3.3.
Then I decided to leave the plugin I mentioned before in place; it deletes all the duplicated URLs generated automatically but does not solve the language problem. So, here I am.
Any suggestions?
-
Also, if you decide to keep the /es/ section of the website then you'll need to look into hreflang instead of canonical tags, because /es/ and /en/ will not be duplicate content once they're translated.
Read this Q&A from Google for details - https://sites.google.com/site/webmasterhelpforum/en/faq-internationalisation#q20
-
Hey Jose,
If you have an /es/ subfolder then ideally you would be translating that content to Spanish, not canonicalizing that content back to the English version.
I can see from http://www.spain-internship.com/es/internships-in-salamanca that not all /es/ pages are translated - is this true across the entire website?
If you don't have any Spanish content, then you should just kill off the /es/ version entirely.
-
Hi there,
Thanks for the update. Now that you have told me the problem, I found out this is a known bug in Joomla and I am working on it.
I found a plugin, http://styleware.eu/store/item/26-styleware-content-canonical-plugin, that points all the automatically generated duplicate URLs to the home page with a canonical. Sample:
http://www.spain-internship.com/en/home/149-all-en/placement-spain
now has the link <link href="http://www.spain-internship.com" rel="canonical" />. This solves the problem of the core canonical bug.
Would this be a proper solution? Now I only have to deal with the pages duplicated because of the language configuration, either blocking them in robots or adding a canonical, but as long as I control it, it is OK.
Please let me know if this would be a proper solution.
Thank you in advance for your help. If I can help you with something at some point, here we are!
-
OK, the problem is that your pages are all canonical to themselves. The canonical tag should point at the main page for the content, not at each page itself. For your first example, all pages that get their content from http://www.spain-internship.com/en need canonical tags pointing to that page. Instead, the copy page has this:
<link href="http://www.spain-internship.com/fi/etusivu/186-all-fi/home-page-fi" rel="canonical" />
It should have:
<link href="http://www.spain-internship.com/fi/" rel="canonical" />
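If it helps to check this across many pages, a small script can flag copies whose canonical points back at themselves. The following is only a sketch using Python's standard `html.parser`; the class and function names are made up for illustration, and the HTML is passed in as a string rather than fetched from the live site.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def is_self_canonical(page_url, html):
    """Return True when the page's canonical tag points back at the page itself."""
    finder = CanonicalFinder()
    finder.feed(html)
    # Compare without trailing slashes so /fi and /fi/ count as the same page.
    return any(c.rstrip("/") == page_url.rstrip("/") for c in finder.canonicals)


# Example with the copy page from this thread, using hand-written HTML.
copy_url = "http://www.spain-internship.com/fi/etusivu/186-all-fi/home-page-fi"
copy_html = '<head><link href="' + copy_url + '" rel="canonical" /></head>'
print(is_self_canonical(copy_url, copy_html))
```

A generated copy that reports a self-referencing canonical is one where the tag still needs to be pointed at the main page.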
-
I will provide a few so you can look!
Detected as duplicated:
http://www.spain-internship.com/en
http://www.spain-internship.com/en/home/149-all-en/placement-spain
Same here:
http://www.spain-internship.com/fi
http://www.spain-internship.com/fi/etusivu/186-all-fi/home-page-fi
And here:
http://www.spain-internship.com/en/internships-in-salamanca
http://www.spain-internship.com/es/internships-in-salamanca
The first one is the original. The rest have a canonical. Still detected as duplicated.
-
Do you have an example of one of these generated pages as well? Everything looks fine on the main page.
-
Hey,
Yes, sure.
This is the duplicate of /en:
http://www.spain-internship.com/en/home/149-all-en/placement-spain
Thanks!
-
Do you have a link to one of these pages so we can look at how it is deploying the canonical on the page?