"INDEX,FOLLOW" then later in the code "NOINDEX,NOFOLLOW" which does google follow?
-
Background info: we have an established, closed e-commerce system that the company has been using for years. I have only just started and am reviewing the system. I don't have direct access to the code; I can request changes, but it could take months before they take effect (if they are done at all), and we can't change to a new e-commerce system in the short to mid term.
While reviewing the site (with help of seomoz crawl diagnostics) I noticed that some of the existing "landing pages" have in the code:
<meta name="robots" content="INDEX,FOLLOW" /> then a few lines later
<meta name="robots" content="NOINDEX,NOFOLLOW" />
The crawl diagnostics flagged this up, but Webmaster Tools says:
"We didn't detect any issues with non-indexable content on your site."
So the question is: which instruction does Google follow, the first or the second?
Note: clearly this needs to be fixed, but I have a big list of changes for the system, so I need to know how important this is.
Thanks
-
I've never actually had any errors listed for non-indexable content in the HTML Improvements section of WMT, so I'm not 100% sure what would set off that notification, even though the sites I work on do have a number of pages that are NoIndex and/or NoFollow. So I guess the issue would be caused not by purposefully blocking the page but by some other problem that makes your page unable to be crawled properly.
-
Yeah, I started testing like that after posting the question. The page is not coming up, and searching for the URL does not show it either, while other normal ("lower") pages that don't have this problem are showing. So it seems that Google has de-indexed those pages.
It's weird that Webmaster Tools says "We didn't detect any issues with non-indexable content on your site." when there are.
Getting this sorted one way or another is my top priority.
-
If you copy a string of text from the page and paste it into Google search, does your page show up in the results? If so, then it's being indexed despite the second robots tag. If it doesn't show up, then it's not being indexed. So the importance depends on whether you want that page to be indexed and whether or not it is being indexed. Either way, you should look into cleaning that up at some point.
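For anyone who wants to check a batch of pages for this, here is a minimal sketch (not a definitive tool) that finds duplicate robots meta tags in a page's HTML and reports what the combined instruction would be. It assumes the behaviour described in Google's robots meta tag documentation, namely that when robots rules conflict, the more restrictive one applies; that matches the de-indexing reported above. The names `RobotsMetaCollector` and `robots_summary` are made up for this example, and it uses only Python's standard library.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []  # one list of lowercase tokens per robots meta tag

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            content = attr.get("content", "")
            tokens = [t.strip().lower() for t in content.split(",") if t.strip()]
            self.directives.append(tokens)


def robots_summary(html):
    """Summarise robots meta tags, assuming the most restrictive rule wins."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    tokens = {t for tag in collector.directives for t in tag}
    conflict = ("index" in tokens and "noindex" in tokens) or (
        "follow" in tokens and "nofollow" in tokens
    )
    return {
        "tag_count": len(collector.directives),
        "conflict": conflict,
        # Assumption: any noindex/nofollow/none overrides index/follow.
        "effective_index": "noindex" not in tokens and "none" not in tokens,
        "effective_follow": "nofollow" not in tokens and "none" not in tokens,
    }


# The two tags from the question, a few lines apart in the same <head>.
sample = (
    '<head><meta name="robots" content="INDEX,FOLLOW" />\n'
    '<meta name="robots" content="NOINDEX,NOFOLLOW" /></head>'
)
print(robots_summary(sample))
```

Feeding it the saved source of each landing page gives a quick list of which ones carry conflicting tags, which should help when deciding how high to put this on the change list.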