Can you see the 'indexing rules' that are in place for your own site?
-
By 'index rules' I mean the stipulations that constitute whether or not a given page will be indexed.
If you can see them - how?
-
Unfortunately, that would be specific to your own platform and server-side code. When you look at the SEOmoz source code, you're either going to see a nofollow or you're not. The code that drives that is on our servers and is unique to our build (PHP/Cake, I think).
You'd have to dig into the source code generating the Robots.txt file. I don't think you can have a fully dynamic Robots.txt (it has to have a .txt extension), so there must be a piece of code that generates a new Robots.txt file, probably on a timer. It could be called something similar, like Robots.php, Robots.aspx, etc. Just a guess.
FYI, dynamic Robots.txt could be a little dicey - it might be better to do this with a META NOINDEX in the header of the user profile pages. That would also avoid the timer approach. The pages would dynamically NOINDEX themselves as they're created.
-
To hopefully clarify what I'm talking about, I want to provide this example: SEOmoz will remove the "no-follow" tag from the first link in your profile if you get 200 mozpoints.
This is a set rule which I believe will automatically occur once a user reaches the minimum. On my site, a similar rule exists where the meta noindex tag will be removed from a user page if you submit 10 'files'.
There were other rules similar to this created and I need to know what they are. How?
-
On my site, there was a rule created where users are blocked by robots unless they have submitted a minimum number of 'files'. This was done to ensure that only quality user profile pages are being indexed and not just spam/untouched profiles.
There have been other rules like this created but I don't know what they are and I'd like to find out.
-
Hi David,
Do you mean how robots.txt is configured and if the robots file is blocking a certain page from being indexed? If so, yes. If the file is complex and you're not sure if it's blocking a particular page, you can go into Google Webmaster Tool and they have a robots.txt utility where you can input a particular URL and it will tell you if the robots.txt file you are using (or proposing) blocks that URL.
If you mean whether the page is quality enough for a search engine to choose to index it? No, that's part of the algorithm and none of the major engines are that nice and open.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What's the best way of crawling my entire site to get a list of NoFollow links?
Hi all, hope somebody can help. I want to crawl my site to export an audit showing: All nofollow links (what links, from which pages) All external links broken down by follow/nofollow. I had thought Moz would do it, but that's not in Crawl info. So I thought Screaming Frog would do it, but unless I'm not looking in the right place, that only seems to provide this information if you manually click down each link and view "Inlinks" details. Surely this must be easy?! Hope someone can nudge me in the right direction... Thanks....
Intermediate & Advanced SEO | | rl_uk0 -
Staging/Development Site Indexed?
So, my company's site has been pretty tough to try to get moving in the right direction on Google's SERPs. I had believed that it was mainly due to having a shortage of back links and a horrible home page load time. Everything else seems to be set up pretty well. I was messing around and used the site: Google search operator for our staging site. I found stage.site.com and a lot of our other staging pages in the search results. I have to think that this is the problem and causing a duplicate content penalty of the entire site. I guess I now need to 301 redirect the entire site? Has anyone every had this issue before and have fixed it? Thanks for any help.
Intermediate & Advanced SEO | | aua0 -
Blocking Certain Site Parameters from Google's Index - Please Help
Hello, So we recently used Google Webmaster Tools in an attempt to block certain parameters on our site from showing up in Google's index. One of our site parameters is essentially for user location and accounts for over 500,000 URLs. This parameter does not change page content in any way, and there is no need for Google to index it. We edited the parameter in GWT to tell Google that it does not change site content and to not index it. However, after two weeks, all of these URLs are still definitely getting indexed. Why? Maybe there's something we're missing here. Perhaps there is another way to do this more effectively. Has anyone else ran into this problem? The path we used to implement this action:
Intermediate & Advanced SEO | | Jbake
Google Webmaster Tools > Crawl > URL Parameters Thank you in advance for your help!0 -
After Receiving a "Googlebot can't access your site" would this stop your site from being crawled?
Hi Everyone,
Intermediate & Advanced SEO | | AMA-DataSet
A few weeks ago now I received a "Googlebot can't access your site..... connection failure rate is 7.8%" message from the webmaster tools, I have since fixed the majority of these issues but iv noticed that all page except the main home page now have a page rank of N/A while the home page has a page rank of 5 still. Has this connectivity issues reduced the page ranks to N/A? or is it something else I'm missing? Thanks in advance.0 -
SEO Tools You Can't Live Without?
Hi Guys, I'm currently in the middle of creating a comprehensive blog post covering SEO Tools that I wouldn't be able to work without. So far I've got the following down, as I use these on a day to day basis and they make my job infinitely easier. SEOMoz / OSE AHrefs BuzzStream Scrapebox Xenu / Screaming Frog Excel GWT / Analytics / Adwords Keyword Tool What tools or subscriptions do you use on a daily basis and couldn't be without?
Intermediate & Advanced SEO | | SebastianCowie2 -
Site Indexed by Google but not Bing or Yahoo
Hi, I have a site that is indexed (and ranking very well) in Google, but when I do a "site:www.domain.com" search in Bing and Yahoo it is not showing up. The team that purchased the domain a while back has no idea if it was indexed by Bing or Yahoo at the time of purchase. Just wondering if there is anything that might be preventing it from being indexed? Also, Im going to submit an index request, are there any other things I can do to get it picked up?
Intermediate & Advanced SEO | | dbfrench0 -
Is My site seeing a Google Dance ? - Rankings all over the place
My eCommerce website is seeing some rankings flucuate daily from say rank 20 to rank 130 whilst some other keywords ago up and down by as much as 40 places. I have been putting up alot of new unique content and can see in GWT that google has been crawling my site more but given that I was affected by the google panda updates which saw a 40% drop in traffic I'm only just to recover some of it., i have also been trying to get rid of any poor links and our linking building is only concentrating on high quality posts and links. I am wondering if this is the post panda update - "google dance" or is google having issues trying to work where to rank my site and possibly punish me? thanks Sarah.
Intermediate & Advanced SEO | | SarahCollins0 -
Can your site be penalized for changing the url structure and if so how long till you get back?
I'm doing well on yahoo and bing and the only reason I can think of for why I'm not showing on Google is because I changed the url structure a couple of months ago. I have solid on and off page done for this site.
Intermediate & Advanced SEO | | deciph220