Big problem with duplicate page content
-
Hello!
I am a beginner SEO specialist and a have a problem with duplicate pages content. The site I'm working on is an online shop made with Prestashop. The moz crawl report shows me that I have over 4000 duplicate page content. Two weeks ago I had 1400.
The majority of links that show duplicate content looks like bellow:
http://www.sitename.com/category-name/filter1
http://www.sitename.com/category-name/filter1/filter2Firstly, I thought that the filtres don't work. But, when I browse the site and I test it, I see that the filters are working and generate links like bellow:
http://www.sitename.com/category-name#/filter1
http://www.sitename.com/category-name#/filter1/filter2The links without the # do not work; it messes up with the filters.
Why are the pages indexed without the #, thus generating me duplicate content?
How can I fix the issues?
Thank you very much! -
Thank you!
-
Check the prestashop forums, and see if another user has a custom htaccess rule you can put in place. Most likely this is a simple error that can be fixed using url rewriting. If you cannot set the items to redirect for any reason, set the offending pages to no-index in your robots.txt file.
-
Hi Ana
Based on my personal experience with CMS such as Prestashop, they are using URL re-write methods in the .httaccess file. If they are not set up correctly a crawler such as Moz Bot will end up at the wrong url, thus indexing it. Since you can navigate the site and get the correct URLS through clicking, my first thought would be to look on the Prestashop forums to see if there is any updates or advice on fixing the navigation.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
301 redirect to avoid duplicate content penalty
I have two websites with identical content. Haya and ethnic Both websites have similar products. I would like to get rid of ethniccode I have already started to de-index ethniccode. My question is, Will I get any SEO benefit or Will it be harmful if I 301 direct the below only URL’s https://www.ethniccode/salwar-kameez -> https://www.hayacreations/collections/salwar-kameez https://www.ethniccode/salwar-kameez/anarkali-suits - > https://www.hayacreations/collections/anarkali-suits
Intermediate & Advanced SEO | | riyaaaz0 -
Woocommerce SEO & Duplicate content?
Hi Moz fellows, I'm new to Woocommerce and couldn't find help on Google about certain SEO-related things. All my past projects were simple 5 pages websites + a blog, so I would just no-index categories, tags and archives to eliminate duplicate content errors. But with Woocommerce Product categories and tags, I've noticed that many e-Commerce websites with a high domain authority actually rank for certain keywords just by having their category/tags indexed. For example keyword 'hippie clothes' = etsy.com/category/hippie-clothes (fictional example) The problem is that if I have 100 products and 10 categories & tags on my site it creates THOUSANDS of duplicate content errors, but If I 'non index' categories and tags they will never rank well once my domain authority rises... Anyone has experience/comments about this? I use SEO by Yoast plugin. Your help is greatly appreciated! Thank you in advance. -Marc
Intermediate & Advanced SEO | | marcandre1 -
Is it a problem to use a 301 redirect to a 404 error page, instead of serving directly a 404 page?
We are building URLs dynamically with apache rewrite.
Intermediate & Advanced SEO | | lcourse
When we detect that an URL is matching some valid patterns, we serve a script which then may detect that the combination of parameters in the URL does not exist. If this happens we produce a 301 redirect to another URL which serves a 404 error page, So my doubt is the following: Do I have to worry about not serving directly an 404, but redirecting (301) to a 404 page? Will this lead to the erroneous original URL staying longer in the google index than if I would serve directly a 404? Some context. It is a site with about 200.000 web pages and we have currently 90.000 404 errors reported in webmaster tools (even though only 600 detected last month).0 -
I have search result pages that are completely different showing up as duplicate content.
I have numerous instances of this same issue in our Crawl Report. We have pages showing up on the report as duplicate content - they are product search result pages for completely different cruise products showing up as duplicate content. Here's an example of 2 pages that appear as duplicate : http://www.shopforcruises.com/carnival+cruise+lines/carnival+glory/2013-09-01/2013-09-30 http://www.shopforcruises.com/royal+caribbean+international/liberty+of+the+seas We've used Html 5 semantic markup to properly identify our Navigation <nav>, our search widget as an <aside>(it has a large amount of page code associated with it). We're using different meta descriptions, different title tags, even microformatting is done on these pages so our rich data shows up in google search. (rich snippet example - http://www.google.com/#hl=en&output=search&sclient=psy-ab&q=http:%2F%2Fwww.shopforcruises.com%2Froyal%2Bcaribbean%2Binternational%2Fliberty%2Bof%2Bthe%2Bseas&oq=http:%2F%2Fwww.shopforcruises.com%2Froyal%2Bcaribbean%2Binternational%2Fliberty%2Bof%2Bthe%2Bseas&gs_l=hp.3...1102.1102.0.1601.1.1.0.0.0.0.142.142.0j1.1.0...0.0...1c.1.7.psy-ab.gvI6vhnx8fk&pbx=1&bav=on.2,or.r_qf.&bvm=bv.44442042,d.eWU&fp=a03ba540ff93b9f5&biw=1680&bih=925 ) How is this distinctly different content showing as duplicate? Is SeoMoz's site crawl flawed (or just limited) and it's not understanding that my pages are not dupe? Copyscape does not identify these pages as dupe. Should we take these crawl results more seriously than copyscape? What action do you suggest we take? </aside> </nav>
Intermediate & Advanced SEO | | JMFieldMarketing0 -
What constitutes duplicate content?
I have a website that lists various events. There is one particular event at a local swimming pool that occurs every few months -- for example, once in December 2011 and again in March 2012. It will probably happen again sometime in the future too. Each event has its own 'event' page, which includes a description of the event and other details. In the example above the only thing that changes is the date of the event, which is in an H2 tag. I'm getting this as an error in SEO Moz Pro as duplicate content. I could combine these pages, since the vast majority of the content is duplicate, but this will be a lot of work. Any suggestions on a strategy for handling this problem?
Intermediate & Advanced SEO | | ChatterBlock0 -
Duplicate Content Help
seomoz tool gives me back duplicate content on both these URL's http://www.mydomain.com/football-teams/ http://www.mydomain.com/football-teams/index.php I want to use http://www.mydomain.com/football-teams/ as this just look nice & clean. What would be best practice to fix this issue? Kind Regards Eddie
Intermediate & Advanced SEO | | Paul780 -
Duplicate content on index.htm page
How do I avoid duplicate content on the index.htm page . I need to redirect the spider from the /index.htm file to the main root of http://www.manandhisvan.com.au and hence avoid duplicate content. Does anyone know of a foolproof way of achieving this without me buggering up the complete site Cheers Freddy
Intermediate & Advanced SEO | | Fatfreddy0 -
Subdomains - duplicate content - robots.txt
Our corporate site provides MLS data to users, with the end goal of generating leads. Each registered lead is assigned to an agent, essentially in a round robin fashion. However we also give each agent a domain of their choosing that points to our corporate website. The domain can be whatever they want, but upon loading it is immediately directed to a subdomain. For example, www.agentsmith.com would be redirected to agentsmith.corporatedomain.com. Finally, any leads generated from agentsmith.easystreetrealty-indy.com are always assigned to Agent Smith instead of the agent pool (by parsing the current host name). In order to avoid being penalized for duplicate content, any page that is viewed on one of the agent subdomains always has a canonical link pointing to the corporate host name (www.corporatedomain.com). The only content difference between our corporate site and an agent subdomain is the phone number and contact email address where applicable. Two questions: Can/should we use robots.txt or robot meta tags to tell crawlers to ignore these subdomains, but obviously not the corporate domain? If question 1 is yes, would it be better for SEO to do that, or leave it how it is?
Intermediate & Advanced SEO | | EasyStreet0