How to get a list of URLs blocked by robots.txt
-
This is my site.
It's in WordPress. I just want to know whether there is any way to get the list of URLs blocked by robots.txt.
Google Webmaster Tools isn't showing the list; it only gives the number of blocked URLs.
Is there any plugin or software that can extract the list of blocked URLs?
-
If you use Bing Webmaster Tools, you can see a complete list of all URLs blocked by robots.txt. You can export the file and then filter it.
Just go to Reports & Data > Crawl Information within your Bing Webmaster account. I am not aware of this feature being in Google Webmaster Tools. Hope this helps.
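As a rough illustration of the "export and then filter" step, here is a minimal Python sketch. It assumes the export has already been reduced to a plain list of URLs; the real export format is not described in this thread, and the example.com URLs and the function name `filter_by_path_prefix` are made up for illustration.

```python
from urllib.parse import urlparse


def filter_by_path_prefix(urls, prefix):
    """Return only the URLs whose path starts with the given prefix."""
    return [u for u in urls if urlparse(u).path.startswith(prefix)]


# Hypothetical exported URLs (not taken from any real report).
exported = [
    "http://example.com/section/",
    "http://example.com/section/item-1/",
    "http://example.com/other/page.php",
]

# Keep only the URLs under one blocked directory and count them.
in_section = filter_by_path_prefix(exported, "/section/")
print(len(in_section))
for url in in_section:
    print(url)
```

Matching on the parsed path rather than on the whole URL string avoids false hits when the prefix text also appears in a query string or domain name.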
-
simon_realbuzz, buddy, if I use /classifieds/, it means I am blocking every URL that starts with it. I want a list of all the blocked URLs on my site.
Example:
http://muslim-academy.com/classifieds/
How many URLs under this classifieds path are blocked by my robots.txt?
-
I'm sorry, I don't follow. If you go to that URL, you will see the list of blocked URLs, as I've pasted below:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /forum/viewtopic.php?p=
Disallow: /forum/viewtopic.php?=&p=
Disallow: /forum/viewtopic.php?t=
Disallow: /forum/viewtopic.php?start=
Disallow: /forum/&view=previous
Disallow: /forum/&view=next
Disallow: /forum/&sid=
Disallow: /forum/&p=
Disallow: /forum/&sd=a
Disallow: /forum/&start=0
Disallow: /forum/memberlist.php
Disallow: /forum/posting.php
Disallow: /classifieds/
Disallow: /forum/index.php
Disallow: /forum/ucp
Disallow: /http://muslim-academy.com/الا�%A..
Disallow: /http://muslim-academy.com/особенн%D
Disallow: /http://muslim-academy.com/ислам-ка%
Disallow: /http://muslim-academy.com/classifieds/ads/
Disallow: /http://muslim-academy.com/значени%D..
Disallow: /.ifieds/
Disallow: /.ifieds/ads/
Disallow: /forum/alternatelogin/al_tw_connect.php?authentication=1
Disallow: /forum/search.php
-
simon_realbuzz, I need the list of blocked URLs, not the path to the robots.txt file.
-
You can view your robots file simply by appending /robots.txt to your site URL. Just enter http://muslim-academy.com/robots.txt in your browser and you'll be able to view your robots file.
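The robots file only lists rules, not the individual URLs they block. If you already have a list of your site's URLs (for example from a sitemap or a crawl), one possible approach is to test each URL against the rules yourself. The sketch below uses Python's standard `urllib.robotparser` with a few rules copied from the file quoted above. The candidate URLs and the helper name `blocked_urls` are hypothetical. The standard-library parser does plain prefix matching, so its results may differ from Googlebot's for rules that rely on wildcards.

```python
from urllib.robotparser import RobotFileParser

# A few rules copied from the robots.txt quoted earlier in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /classifieds/
Disallow: /forum/posting.php
"""


def blocked_urls(robots_txt, candidate_urls, user_agent="*"):
    """Return the candidate URLs that the given rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in candidate_urls if not parser.can_fetch(user_agent, u)]


# Hypothetical URLs; in practice these would come from a sitemap or a crawl.
candidates = [
    "http://muslim-academy.com/classifieds/",
    "http://muslim-academy.com/classifieds/ads/example-ad/",
    "http://muslim-academy.com/forum/posting.php",
    "http://muslim-academy.com/example-article/",
]

blocked = blocked_urls(ROBOTS_TXT, candidates)
print(len(blocked))
for url in blocked:
    print(url)
```

The rules are parsed from a string here so the sketch runs without network access; `RobotFileParser` can also fetch a live file with `set_url()` followed by `read()`.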