100K Webmaster Central Not Found Links?
-
http://screencast.com/t/KLPVGTzM I just logged into our Webmaster Central account and found that it shows 100k links as not found. After looking through them, they all appear to come from our search bar, for searches with no results. Are we doing something wrong here?
-
Yeah, I read through that article yesterday and see that they recommend the same setting that the Yoast plugin should already be applying. I never got a response from him, though, to see if there is something missing.
For now, I plan on adding this to the robots.txt file to see what results I get.
Do you know how long it takes for updates to show up in GWT? Will this update within a few weeks, or would it take longer than that?
Thanks for all the help!
BJ
-
Hello BJ.
The robots.txt file must be on your server, in the document root.
Here is information about how to configure robots.txt
Note that it does have a warning at the end about how you could possibly lose some link juice, but that is probably a much smaller problem than the one you are trying to fix.
Nothing is perfect, and with the rate at which Google changes its mind, who knows what the right thing to do is this month.
Once you have edited robots.txt, you don't need to do anything else.
Except I just had a thought: how to get Google to remove those items from your Webmaster Tools. I think you should be able to tell them to purge those entries from GWT. Set it to show 500 per page, then just cycle through and mark them as fixed.
-
Sorry to open this back up after a month. In adding this to the robots.txt file, is there something that needs to be done within the code of the site? Or can I simply update the robots.txt file within Google Webmaster Tools?
I was hoping to get a response from Yoast on his blog post. It seems there were a number of questions similar to mine, but he never addressed them.
Thanks,
BJ
-
We all know nothing lasts forever.
A code change can do all kinds of things.
Things that were important are sometimes less important, or not important at all.
Sometimes yesterday's advice is no longer true.
If you make a change, or even if you make no change, but the crawler or the indexer changes, then we can be surprised at the results.
While working on this other thread:
http://www.seomoz.org/q/is-no-follow-ing-a-folder-influences-also-its-subfolders#post-74287
I did a test and checked my logs. A nofollow meta tag and a nofollow link do not stop the crawlers from following. What they do (we think) is stop PageRank from being passed. That is all they do.
That is why the robots.txt file is the only way to tell the crawlers to stop following down a tree. (until there is another way)
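A log check like the one described above can be sketched in Python. This is only an illustration, assuming an Apache/Nginx-style "combined" access log. The sample log lines, IP addresses, and path are made up, and `crawler_hits` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed "combined" access log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawler_hits(lines, crawler_tokens=("googlebot", "rogerbot")):
    """Count requests per (crawler token, path) for known crawler user agents."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for token in crawler_tokens:
            if token in agent:
                hits[(token, match.group("path"))] += 1
    return hits


# Made-up sample lines: one crawler request and one ordinary browser request.
SAMPLE = [
    '192.0.2.10 - - [10/Oct/2012:13:55:36 +0000] "GET /private/page.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2012:13:56:01 +0000] "GET /private/page.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

for (token, path), count in crawler_hits(SAMPLE).items():
    print(token, path, count)
```

If a crawler still shows up in the counts for a page that is only reachable through nofollow links, that would match the result of the test described above.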
-
Ok, I've posted a question on the Yoast.com blog to see what other options we might have. Thanks for the help!
-
It is because Roger ignores those META tags.
Google often ignores them too.
The robots.txt file is a much better option for those crawlers.
There are some crawlers that ignore the robots file too, but you have no control over them unless you can put their IPs in the firewall or add code to ignore all of their requests.
-
Ok, I just did a little more research into how Yoast handles this within the plugin and came across this article: http://yoast.com/example-robots-txt-wordpress/
In the article he states that this is already included within the plugin on search pages:
I just confirmed this by doing this search on my site and looking at the code: http://www.discountqueens.com/?s=candy
So this has always been in place. Why would the 100K not found links still be showing up?
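One way to double-check what a search page actually emits is to parse its source for the robots meta tag. Here is a minimal sketch in Python using only the standard library. The sample HTML and its `noindex,follow` value are placeholders, not copied from the site or the plugin, and `RobotsMetaParser` and `robots_directives` are hypothetical names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> list[str]:
    """Return the robots meta directives present in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Placeholder page source, assumed for illustration only.
SAMPLE_HTML = (
    '<html><head><meta name="robots" content="noindex,follow"/></head>'
    "<body></body></html>"
)

print(robots_directives(SAMPLE_HTML))
```

Keep in mind that a crawler has to fetch a page before it can read the meta tag, so a meta directive on its own does not stop the page from being requested.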
-
We didn't have these errors showing up previously, so that's why I was really suspicious. Also, we have Joost de Valk's SEO plugin installed on our site, and I thought there was an option to stop the search pages from being indexed?
-
Just to support Alan Gray's response, I'll say it's very important to block crawlers from your site search. It throws errors (bots try to guess what to put in a search box), and any search results that get into the index will cause content conflicts, dilute ranking values, and, in the worst case, create the false impression that you have a lot of very thin or near-duplicate content pages.
-
The search bar results are good for searchers but not for search engines. You can stop all search engines and Roger (the SEOmoz crawler) from going into those pages by adding an entry to your robots.txt file. Roger only responds to his own section of the robots file, so anything you make global will not work for him.
User-agent: rogerbot
Disallow: /search/*
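To sanity-check rules like this before relying on them, Python's standard `urllib.robotparser` can evaluate robots.txt text against sample URLs. The sketch below is an illustration under stated assumptions: the robots.txt contents, the `example.com` URLs, and the `is_blocked` helper are all made up. It uses the plain prefix `/search/` rather than `/search/*`, because this particular parser does prefix matching and does not interpret wildcards.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one global group plus a dedicated rogerbot group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/

User-agent: rogerbot
Disallow: /search/
"""


def is_blocked(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt text disallows the URL for the user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Example URLs are placeholders, not taken from the site in this thread.
for agent in ("rogerbot", "Googlebot"):
    for url in ("http://www.example.com/search/candy", "http://www.example.com/about/"):
        print(agent, url, "blocked" if is_blocked(agent, url) else "allowed")
```

Whatever path goes in the Disallow line has to match the URL format your site search really produces, so it is worth testing with a few real search URLs from your own site.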