Get a list of robots.txt-blocked URLs and tell Google to crawl and index them.
-
Some of my key pages got blocked by the robots.txt file. I have since made the required changes to the robots.txt file, but how can I get the list of blocked URLs?
My Webmaster Tools page (Health > Blocked URLs) shows only a count, not the blocked URLs themselves. So my first question is: where can I fetch these blocked URLs, and how can I get them back into the search results?
One other interesting point: the blocked pages are still showing up in searches. The title appears fine, but the description shows "blocked by robots.txt file".
I need an urgent recommendation, as I do not want to see my traffic drop any further.
-
Regarding "changing the lastmod of those pages to today": how can I make these changes?
The latest news is that I resubmitted the sitemap, and there were no warnings this time.
-
I imagine that, since you've got a robots.txt error, you probably ended up closing a whole directory to bots that you wanted indexed. You can easily spot the directory and resubmit a sitemap to Google, changing the lastmod of those pages to today and the priority to 1, but only for those pages.
If you still receive warnings, it may be due to errors in your sitemap; you're probably including some directory you don't want. You can test this in GWT by putting the URL you want to keep in the index into the box at the bottom and seeing whether any URLs are being blocked by your robots.txt.
If you want, you can post your robots.txt here along with the URIs you want indexed, without the domain, so they won't be public. Hope this helps.
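To illustrate that sitemap change (the URL and date below are placeholders, not taken from the thread), a single resubmitted entry for one of the previously blocked pages would look something like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per previously blocked page you want recrawled -->
  <url>
    <loc>http://www.example.com/previously-blocked-page/</loc>
    <lastmod>2012-11-21</lastmod>
    <priority>1.0</priority>
  </url>
</urlset>
```

Only the formerly blocked pages get the fresh lastmod and priority 1.0; the rest of the sitemap stays as it was.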
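Besides the GWT tester, a quick local check is possible with Python's standard `urllib.robotparser`. The rules and URLs below are made-up placeholders standing in for the corrected robots.txt, not the poster's actual file:

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules standing in for the corrected robots.txt.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A key page outside the disallowed directory should now be fetchable...
print(parser.can_fetch("*", "http://www.example.com/key-page/"))      # True
# ...while the deliberately blocked directory stays blocked.
print(parser.can_fetch("*", "http://www.example.com/private/admin"))  # False
```

This mirrors what the GWT "blocked URLs" box does: paste a URL, see whether the current rules allow Googlebot to fetch it.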
-
OK, I resubmitted it, but even with the updated file it reports a lot of errors: 20,016 warnings. I think it takes some time.
I have not added a noindex attribute in my header region; it was all a mess with the robots.txt file. Does that mean that, with the site still showing up in the SERPs, the ranking will probably stay the same, or has it been demoted?
-
Go into GWMT and resubmit your sitemap.xml files (with the URLs you want indexed) for recrawling, and Google will digest the sitemaps again. Instead of waiting for Googlebot to come around on its own, you are requesting that it come around. Also, reference those new sitemap files in your robots.txt file.
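A minimal robots.txt along those lines (the paths and sitemap URL are placeholders) would be:

```
User-agent: *
# Keep genuinely private areas blocked...
Disallow: /admin/
# ...but no longer disallow the directory containing the key pages.

# Point crawlers at the resubmitted sitemap(s):
Sitemap: http://www.example.com/sitemap.xml
```

The `Sitemap:` directive is independent of the `User-agent` groups, so it can sit anywhere in the file.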
-
In Google Webmaster Tools, go to Health -> Fetch As Google. Then add the previously blocked URL and click Fetch. Once you've done that, refresh the page and click "Submit to index". That should get Google indexing those pages again.
Getting external links to your pages also helps get them crawled and indexed, so it may be worth submitting your pages to social bookmarking sites, or getting other types of backlinks to your previously blocked pages if possible.
-
Since you fixed your robots.txt file, you should be good to go. It will probably take a few days for Google to recrawl your site and update the index with the URLs it is now allowed to crawl.
Blocked URLs can still show up in SERPs if you haven't defined the noindex attribute in your <head> section.
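For completeness, the noindex attribute referred to here is a robots meta tag placed inside the page's <head>:

```html
<head>
  <!-- Keeps this page out of the search index. Note that Googlebot must be
       able to crawl the page to see this tag, so the URL must not also be
       blocked in robots.txt. -->
  <meta name="robots" content="noindex">
</head>
```

That crawlability caveat is exactly why robots.txt-blocked pages can still appear in results: Google knows the URL exists but cannot fetch the page to see any noindex directive.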