Prevent indexing of dynamic content
-
Hi folks!
I discovered a bit of an issue with a client's site. The site consists primarily of static HTML pages; however, within one page (a car photo gallery), a line of PHP code:
dynamically generates 100 or so pages comprising the photo gallery, all with the same page title and meta description. The photo gallery script resides in the /gallery folder, which I attempted to block via robots.txt, to no avail. My next step will be to include a:
within the head section of the HTML page, but I am wondering whether this will stop the bots dead in their tracks, or whether they will still be able to pick up on the pages generated by the call to the PHP script a bit further down the page.
Dino
-
Hello Steven,
Thank you for providing another perspective. However, all factors considered, I agree with Shane's approach on this one. The pages add very little merit to the site and exist primarily to provide the site users with eye-candy (e.g. photos of classic cars).
-
Just personally, I would still deindex or canonicalize them. They are just pages with a few images, so they are not of much value. Unless all titles and descriptions target varying keywords and content is added, they will cannibalize each other, and possibly even drag down the site due to hundreds of pages of thin content.
So, from an SEO perspective, it probably IS better to deindex or canonicalize them. Three to five or so years ago, maybe the advice would have been to keep them and target keywords, but not in the age of content
(unless the images were optimized for image searches for saleable products, but I do not think they are).
-
Hi Dino,
I know this won't solve the immediate problem you asked about, but wouldn't it be better for your client's site (and for SEO) to alter the PHP so that the title and meta description are replaced with variables that are also dynamic, depending on whichever of the 100 or so pages gets created?
That way, rather than worrying about a robot seeing 100 pages as duplicate content, it could see 100 pages as 100 pages.
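To make the idea concrete, here is a rough sketch of the logic. The client's site uses PHP and I have not seen the gallery script, so this is Python purely for illustration; the function name, the photo names, and the wording of the title and description are all made up.

```python
from html import escape


def gallery_head(photo_name: str, index: int, total: int) -> str:
    """Build a distinct <title> and meta description for one gallery page.

    Hypothetical helper: the real gallery script and its data are unknown.
    """
    title = f"{photo_name} - Classic Car Gallery (photo {index} of {total})"
    description = f"Photo {index} of {total} in the classic car gallery: {photo_name}."
    return (
        f"<title>{escape(title)}</title>\n"
        f'<meta name="description" content="{escape(description, quote=True)}">'
    )


# Made-up sample data standing in for the 100 or so gallery photos.
photos = ["1957 Chevrolet Bel Air", "1965 Ford Mustang"]
heads = [gallery_head(name, i, len(photos)) for i, name in enumerate(photos, start=1)]

for head in heads:
    print(head)
```

The only point is that each generated page gets its own title and description built from data the script already has, instead of every page sharing one hard-coded pair.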
-
It depends on how the pages are being created (I would assume it is off of a template page).
So within the template of this dynamically created page you would place
But if this is the global template, you cannot do this, as it will noindex every page, which of course is bad.
If you want to PM me the URL of the page, I can take a look at your code and see what is going on and how to rectify it. Right now I think we are talking about the same principles, but using different words.
What I am saying really is pretty straightforward: the pages that you do not want indexed DO NOT need a nofollow; they need a meta noindex.
But there are many variables. If you have already disallowed the directory in robots.txt, then no bot will go there to get the updated noindex directive.
If there is no way to add a meta noindex, then you need to nofollow the links and put in for a manual removal.
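The robots.txt point can be checked with a small sketch. This uses Python's standard `urllib.robotparser`; the robots.txt contents and the example.com URLs are my assumptions about what the setup might look like, not the client's actual files.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: a blanket disallow on the gallery folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /gallery/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, agent: str = "*") -> bool:
    """Return True if a crawler that obeys robots.txt may fetch the URL."""
    return parser.can_fetch(agent, url)


# Hypothetical URLs for illustration only.
gallery_page = "https://example.com/gallery/photo.php?id=7"
static_page = "https://example.com/cars.html"

print(may_crawl(gallery_page))  # the gallery page is blocked from crawling
print(may_crawl(static_page))   # the static page is not
```

A crawler that obeys this file never requests the gallery page, so it never reads a meta noindex placed in that page's head. That is why a disallow and a noindex on the same URLs tend to work against each other.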
-
I completely understand and agree with all the points you have conveyed. However, I am not certain of the best approach to "noindex" the URLs that are being created dynamically from within the static HTML page. Maybe I am making this more complex than it needs to be...
-
So it is the dynamically created pages themselves that you want out of the index, not the page that contains the links?
If so, noindex the pages that are created dynamically.
"Therein lies the problem. I did have the nofollow directive in place specifying the /gallery/ folder, but apparently, the bots still crawled it."
Nofollow does not remove a page from the index; it only tells the bot not to pass authority. It is still feasible that the bot will crawl the link, so without the noindex, nofollow is not the correct directive: the page (even though nofollowed) is still being reached and indexed.
PS. If you have the nofollow on the links, you may want to remove it so the bots go straight through to the page and grab the noindex directive. If you want to try not to let any authority "evaporate", you can continue to nofollow, but you may need to request that the dynamically generated pages (URLs) be removed using webmaster tools.
-
The goal is to have the page remain in the index, but not follow any dynamically generated links on the page. The nofollow directive (in place for months) has not done the job.
-
If a link is coming into the page and you have noindex, nofollow, this would remove the page from the index and prevent the following of any links.
This is NOT instant, and can take months to occur depending on depth of page, crawl schedule, etc. (You can try to speed it up by using webmaster tools to remove the URL.)
What is the goal you are attempting to achieve?
To get the page out of index, but still followed?
Or remain in index, but just not follow links on page?
-
Therein lies the problem. I did have the nofollow directive in place specifying the /gallery/ folder, but apparently, the bots still crawled it. I agree that the noindex removes the page, but I wasn't certain if it prevented crawling of the page, as I have read mixed opinions on this.
I just thought of something else... perhaps an external url is linking to this page - allowing it to be crawled. I am off to examine this perspective.
Thanks for your response!
-
noindex only removes the specific page (or the pages created off the template) that you place the tag in from the index, and disallows indexing it; this takes effect upon the next crawl of the page.
Bots will still crawl the page and follow any readable links on it, as long as there is no nofollow directive.
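As a rough illustration of how the two directives are independent, here is a small sketch that reads a page's robots meta tag and reports the two decisions separately. It is Python using only the standard `html.parser`; the class and function names are made up, and it only models how a compliant bot would interpret the tag, not what any particular search engine actually does.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives |= {d.strip().lower() for d in content.split(",") if d.strip()}


def robots_policy(html: str) -> dict:
    """Return whether the page may be indexed and whether its links may be followed."""
    meta = RobotsMetaParser()
    meta.feed(html)
    blocked_all = "none" in meta.directives  # "none" is shorthand for noindex, nofollow
    return {
        "index": "noindex" not in meta.directives and not blocked_all,
        "follow": "nofollow" not in meta.directives and not blocked_all,
    }


print(robots_policy('<head><meta name="robots" content="noindex, follow"></head>'))
print(robots_policy('<head><meta name="robots" content="noindex, nofollow"></head>'))
```

With "noindex, follow" the page drops out of the index but its links are still followed; only adding nofollow changes the second part.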
I am not sure I fully understand the situation, so I would not call this my "recommendation", just an answer to the specific question:
"but I am wondering if this will stop the bots dead in their tracks or will they still be able to pick-up on the pages generated"
Hope this helps!