Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Magento Dublicate Content (Noindex and Rel"canonical")
-
Hi All,
Just looking for some advice regarding my website on magento.
We by mistake didnt enable canonical tags and noindex tags so had a big problem with dublicate content from filter pages but also have URLs to Cats as Yes so this didnt help with not having canonical tags enabled.
We now have everything enabled for a few weeks now but dont see much drop in indexed pages in google. (currently 27k and we have only 5k products)
My question basically is how do we speed up noindexation of dublicate content and also would you change URL to cats as No so google just now sees the url to products? (my concerns with this is would leaving it to Yes help because it will hopefully read the canonical tags on products now)
Thank you in advance
Michael
-
Hi Carson
Thank you for replying and the indepth answers.
I did read somewhere that dublicate content on your own website isnt too bad but im glad you have helped me clear things up.
So would you change cat urls to no or leave them to yes for now till google can see all the canoical tags on products?
Thanks
Mike
-
I think there's an underlying assumption here that duplicate content will harm your site, and that's not necessarily true. There's no "duplicate content penalty" - it's more than a filter. Google is better than most at recognizing this, especially with common CMS like Magento and WP. Google attempts to look at the links going to both pages and understand their authority together.
Duplicate content is more of an issue if you're pulling content that others are using as well, e.g. on product descriptions provided by manufacturers and other types of content. Google won't "penalize" you, but they will sometimes filter your site out in favor of the most authoritative site with that content. It's also an issue (mostly for Panda) if you're creating keyword pages that contain duplicate of even very-similar content just to rank for a bunch of very similar keywords.
So my first bit of advice is, "don't obsess over intra-site duplicate content."
That said, it's best to reduce and avoid duplicate content 1) for less-sophisticated search engine, 2) for the sake of your own analytics data integrity and simplicity, 3) just in case Google doesn't get it (very rare).
Set the categories up however you think is best for the user (generally just the product name without categories), double-check the canonical URLs, and wait for Google to catch up on the canonical and noindex. It can take many months depending on your site's authority, but it's unlikely to move the needle either way. Keep in mind that Google may keep pages in the index even if they are honoring the canonical tag - they'll just show the canonical version but keep both indexed. That's working as intended - don't worry about that
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What is the difference between "Referring Pages" and "Total Backlinks" [on Ahrefs]?
I always thought they were essentially the same thing myself but appears there may be a difference? Any one care to help me out? Cheers!
Technical SEO | | Webrevolve0 -
"Fourth-level" subdomains. Any negative impact compared with regular "third-level" subdomains?
Hey moz New client has a site that uses: subdomains ("third-level" stuff like location.business.com) and; "fourth-level" subdomains (location.parent.business.com) Are these fourth-level addresses at risk of being treated differently than the other subdomains? Screaming Frog, for example, doesn't return these fourth-level addresses when doing a crawl for business.com except in the External tab. But maybe I'm just configuring the crawls incorrectly. These addresses rank, but I'm worried that we're losing some link juice along the way. Any thoughts would be appreciated!
Technical SEO | | jamesm5i0 -
How Does Google's "index" find the location of pages in the "page directory" to return?
This is my understanding of how Google's search works, and I am unsure about one thing in specific: Google continuously crawls websites and stores each page it finds (let's call it "page directory") Google's "page directory" is a cache so it isn't the "live" version of the page Google has separate storage called "the index" which contains all the keywords searched.  These keywords in "the index" point to the pages in the "page directory" that contain the same keywords. When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory" These returned pages are given ranks based on the algorithm The one part I'm unsure of is how Google's "index" knows the location of relevant pages in the "page directory".  The keyword entries in the "index" point to the "page directory" somehow. I'm thinking each page has a url in the "page directory", and the entries in the "index" contain these urls.  Since Google's "page directory" is a cache, would the urls be the same as the live website (and would the keywords in the "index" point to these urls)? For example if webpage is found at wwww.website.com/page1, would the "page directory" store this page under that url in Google's cache? The reason I want to discuss this is to know the effects of changing a pages url by understanding how the search process works better.
Technical SEO | | reidsteven750 -
Rel="Follow"? What the &#@? does that mean?
I've written a guest blog post for a site. In the link back to my site they've put a rel="follow" attribute. Is that valid HTML? I've Googled it but the answers are inconclusive, to say the least.
Technical SEO | | Jeepster0 -
"nofollow pages" or "duplicate content"?
We have a huge site with lots of geographical-pages in this structure: domain.com/country/resort/hotel domain.com/country/resort/hotel/facts domain.com/country/resort/hotel/images domain.com/country/resort/hotel/excursions domain.com/country/resort/hotel/maps domain.com/country/resort/hotel/car-rental Problem is that the text on ie. /excursions is often exactly the same on .../alcudia/hotel-sea-club/excursion and .../alcudia/hotel-beach-club/excursion The two hotels offer the same excursions, and the intro text on the pages are the exact same throughout the entire site. This is also a problem on the /images and /car-rental pages. I think in most cases the only difference on these pages is the Title, description and H1. These pages do not attract a lot of visits through search-engines. But to avoid them being flagged as duplicate content (we have more than 4000 of these pages - /excursions, /maps, /car-rental, /images), do i add a nofollow-tag to these, do i block them in robots.txt or should i just leave them and live with them being flagged as duplicate content? Im waiting for our web-team to add a function to insert a geographical-name in the text, so i could add ie #HOTELNAME# in the text and thereby avoiding the duplicate text. Right now we have intros like: When you visit the hotel ... instead of: When you visit Alcudia Sea Club But untill the web-team has fixed these GEO-tags, what should i do? What would you do and why?
Technical SEO | | alsvik0 -
How is a dash or "-" handled by Google search?
I am targeting the keyword AK-47 and it the  variants in search (AK47, AK-47, AK 47) .  How should I handle on page SEO?  Right now I have AK47 and AK-47 incorporated. So my questions is really do I need to account for the space or is Google handling a dash as a space? At a quick glance of the top 10 it seems the dash is handled as a space, but I just wanted to get a conformation from people much smarter then I at seomoz. Thanks, Jason
Technical SEO | | idiHost0 -
Should we use "and" or "&"?
Our client has an ampersand in their brand name. The logo has "&", their url is spelled out. I'm trying to get them to standardize the use of the name for directories/listings. Should we use "and" or "&"?
Technical SEO | | vernonmack0 -
301 for "index.php" in Web.config?
Hi there, I'm trying to create a 301 redirect for the file "index.php" but I keep getting a "fail to redirect" message in Firefox whenever I insert it into the Web.config file. <location path="index.php"></location> Is there anyway around this? Thanks for any help According to Open Site Explorer, there are about 500 links to my index file but it only has a 302 status so will not be passing link juice.
Technical SEO | | tdsnet0