I have more pages in my sitemap being blocked by the robots.txt file than I have being allowed to be crawled. Is Google going to hate me for this?
-
I'm using rules to block all pages whose URLs start with "copy-of" on my website, because people have a bad habit of duplicating new product listings to create our refurbished, surplus, etc. listings for those products. To avoid Google seeing these as duplicate pages I've blocked them in robots.txt, but of course they are still automatically generated in our sitemap. How bad is this?
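For reference, a robots.txt rule matching the setup described (the exact path prefix is an assumption; adjust it to the real URL pattern) might look like:

```text
User-agent: *
Disallow: /copy-of
```

Note that `Disallow` matches by prefix and only prevents crawling, not indexing, which is the crux of the answers below.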
-
When you say "people," do you mean your own web team duplicates content to make their job easier? Or am I missing something?
If that's the case, you really should create unique URLs with unique page titles, product info, etc. The correct way to avoid getting hit for duplicate content is not to create it in the first place. What you're doing now is more of a band-aid solution to the problem.
Keep in mind that even though creating unique content in situations like this can seem daunting and/or be more expensive, there are probably huge long-term gains to be made if you do it right.
-
It's not bad, just not best practice, because Google can still index the URLs if they are mentioned on other pages. To quote them:
"While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information..."
What I would do instead is either use rel="canonical" or 301 redirects. I hope that helps.
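As a sketch of the canonical approach, each duplicated listing gets a tag in its `<head>` pointing at the original product page (the URLs here are hypothetical):

```html
<!-- On the duplicate, e.g. https://example.com/copy-of-widget -->
<link rel="canonical" href="https://example.com/widget" />
```

Alternatively, a server-side 301 (e.g. in an Apache .htaccess: `Redirect 301 /copy-of-widget /widget`) sends both visitors and crawlers to the original. Either way, the duplicates should then no longer be blocked in robots.txt, since Google has to crawl them to see the canonical tag or the redirect.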