Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Some URLs were not accessible to Googlebot due to an HTTP status error.
-
Hello I'm a seo newbie and some help from the community here would be greatly appreciated.
I have submitted the sitemap of my website in google webmasters tools and now I got this warning:
"When we tested a sample of the URLs from your Sitemap, we found that some URLs were not accessible to Googlebot due to an HTTP status error. All accessible URLs will still be submitted."
How do I fix this? What should I do?
Many thanks in advance.
-
You need to confirm that the URLs are in fact 100% of your URLs going into the site map are accessible.
if it's a big issue in a big site send me the URL in a private message I will use deep crawl to create a XML sitemap for you. The screaming frog tool is excellent as well though does performance well with extremely large sites.
check your robots.txt file this so great tool if in case you have more than one (it happens)
http://www.internetmarketingninjas.com/seo-tools/robots-txt-generator/
or
http://tools.seochat.com/tools/robots-txt-validator/
so many great free tools are found right here http://tools.seochat.com/tools/
It could be a number of things although it could be Google being finicky. Run the site through Moz crawler, use feedthebot.com using "tools SEO" or download the free version of http://www.screamingfrog.co.uk/seo-spider/ this will tell you if there is an issue. If your site is static you can even create an alternate site map with screaming frog if your site is large use deep crawl or Moz analytics
be certain there are no sitemaps redirecting to each other so no redirects going from the old site map to the new site map. Make certain that the site map is in an XML format e.g. http://example.com/sitemap.xml or if in a different format like https://example.com/sitemap_index.xml make sure the proper format That resolves when you look at the site map is what is going into Webmaster tools. Be certain the site map does not contain over 500 URLs per the site map so example.com/sitemap1.xml and so on keep numbering them appropriately. sometimes Google is overloaded and does not seem to like to play well with certain site maps or the site map is maybe not generating very well on the server and that is fixed later on. If this is a long-term problem speak to your host or developer. My recommendation is if you've done everything I have talked about that you attempt to submit is the sitemap to to Webmaster tools or simply build a new sitemap and submit that.
so if worse comes to worse take the screaming frog and use this URL to send it to Google
http://www.google.com/submityourcontent/business-owner/
I hope that helps,
Thomas
-
Hi, It looks like you have url's placed in your sitemap that have an HTTP status error. You can search for the urls and remove them from your sitemap or make sure they have the right status. Does it say which status error? And does it say which url's? Did you check those url's?When you use Screaming frog spider tool (free), you can search for status error's this is an easy way to find these url's.
Grtz, Leonie
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
SEO - New URL structure
Hi, Currently we have the following url structure for all pages, regardless of the hierarchy: domain.co.uk/page, such as domain/blog name. Can you, please confirm the following: 1. What is the benefit of organising the pages as a hierarchy, i.e. domain/features/feature-name or domain/industries/industry-name or domain/blog/blog name etc. 2. This will create too many 301s - what is Google's tolerance of redirects? Is it worth for us changing the url structure or would you only recommend to add breadcrumbs? Many thanks Katarina
Technical SEO | | Katarina-Borovska1 -
Folders in url structure?
Hello, Revamping an out-of-date website and am wondering if I need to include the folders (categories) in the url structure? The proposed structure has 8 main folders. I've been reading that Google is ok if the folder is not included in the url, but is it really? The hesitation I have is that the urls are getting long and the main folder only has only a sub folder beneath it. So, /folder-name/facility-name/treatment-overview. This looks too long, doesn't it? Thanks!
Technical SEO | | lfrazer1230 -
Category URL Pagination where URLs don't change between pages
Hello, I am working on an e-commerce site where there are categories with multiple pages. In order to avoid pagination issues I was thinking of using rel=next and rel=prev and cannonical tags. I noticed a site where the URL doesn't change between pages, so whether you're on page 1,2, or 3 of the same category, the URL doesn't change. Would this be a cleaner way of dealing with pagination?
Technical SEO | | whiteonlySEO0 -
Tool to Generate All the URLs on a Domain
Hi all, I've been using xml-sitemaps.com for a while to generate a list of all the URLs that exist on a domain. However, this tool only works for websites with under 500 URLs on a domain. The paid tool doesn't offer what we are looking for either. I'm hoping someone can help with a recommendation. We're looking for a tool that can: Crawl, and list, all the indexed URLs on a domain, including .pdf and .doc files (ideally in a .xls or .txt file) Crawl multiple domains with unlimited URLs (we have 5 websites with 500+ URLs on them) Seems pretty simple, but we haven't been able to find something that isn't tailored toward management of a single domain or that can crawl a huge volume of content.
Technical SEO | | timfrick0 -
Url folder structure
I work for a travel site and we have pages for properties in destinations and am trying to decide how best to organize the URLs basically we have our main domain, resort pages and we'll also have articles about each resort so the URL structure will actually get longer:
Technical SEO | | Vacatia_SEO
A. domain.com/main-keyword/state/city-region/resort-name
_ domain.com/family-condo-for-rent/orlando-florida/liki-tiki-village_ _ domain.com/main-keyword-in-state-city/resort-name-feature _
_ domain.com/family-condo-for-rent/orlando-florida/liki-tiki-village/kid-friend-pool_ B. Another way to structure would be to remove the location and keyword folders and combine. Note that some of the resort names are long and spaces are being replaced dynamically with dashes.
ex. domain.com/main-keyword-in-state-city/resort-name
_ domain.com/family-condo-for-rent-in-orlando-florida/liki-tiki-village_ _ domain.com/main-keyword-in-state-city/resort-name-feature_
_ domain.com/family-condo-for-rent-in-orlando-florida/liki-tiki-village-kid-friend-pool_ Question: is that too many folders or should i combine or break up? What would you do with this? Trying to avoid too many dashes.0 -
Removing URL Parentheses in HTACCESS
Im reworking a website for a client, and their current URLs have parentheses. I'd like to get rid of these, but individual 301 redirects in htaccess is not practical, since the parentheses are located in many URLs. Does anyone know an HTACCESS rule that will simply remove URL parantheses as a 301 redirect?
Technical SEO | | JaredMumford0 -
Special characters in URL
Hi There, We're in the process of changing our URL structure to be more SEO friendly. Right now I'm struggling to find a good way to handle slashes that are part of a targeted keyword. For example, if I have a product page and my product title is "1/2 ct Diamond Earrings in 14K Gold" which of the following URLs is the right way to go if I'm targeting the product title as the search keyword? example.com/jewelry/1-2-ct-diamond-earrings-in-14k-gold example.com/jewelry/12-ct-diamond-earrings-in-14k-gold example.com/jewelry/1_2-ct-diamond-earrings-in-14k-gold example.com/jewelry/1%2F2-ct-diamond-earrings-in-14k-gold Thanks!
Technical SEO | | Richline_Digital0 -
403 forbidden error website
Hi Mozzers, I got a question about new website from a new costumer http://www.eindexamensite.nl/. There is a 403 forbidden error on it, and I can't find what the problem is. I have checked on: http://gsitecrawler.com/tools/Server-Status.aspx
Technical SEO | | MaartenvandenBos
result:
URL=http://www.eindexamensite.nl/ **Result code: 403 (Forbidden / Forbidden)** When I delete the .htaccess from the server there is a 200 OK :-). So it is in the .htaccess. .htaccess code: ErrorDocument 404 /error.html RewriteEngine On
RewriteRule ^home$ / [L]
RewriteRule ^typo3$ - [L]
RewriteRule ^typo3/.$ - [L]
RewriteRule ^uploads/.$ - [L]
RewriteRule ^fileadmin/.$ - [L]
RewriteRule ^typo3conf/.$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule .* index.php Start rewrites for Static file caching RewriteRule ^(typo3|typo3temp|typo3conf|t3lib|tslib|fileadmin|uploads|screens|showpic.php)/ - [L]
RewriteRule ^home$ / [L] Don't pull *.xml, *.css etc. from the cache RewriteCond %{REQUEST_FILENAME} !^..xml$
RewriteCond %{REQUEST_FILENAME} !^..css$
RewriteCond %{REQUEST_FILENAME} !^.*.php$ Check for Ctrl Shift reload RewriteCond %{HTTP:Pragma} !no-cache
RewriteCond %{HTTP:Cache-Control} !no-cache NO backend user is logged in. RewriteCond %{HTTP_COOKIE} !be_typo_user [NC] NO frontend user is logged in. RewriteCond %{HTTP_COOKIE} !nc_staticfilecache [NC] We only redirect GET requests RewriteCond %{REQUEST_METHOD} GET We only redirect URI's without query strings RewriteCond %{QUERY_STRING} ^$ We only redirect if a cache file actually exists RewriteCond %{DOCUMENT_ROOT}/typo3temp/tx_ncstaticfilecache/%{HTTP_HOST}/%{REQUEST_URI}/index.html -f
RewriteRule .* typo3temp/tx_ncstaticfilecache/%{HTTP_HOST}/%{REQUEST_URI}/index.html [L] End static file caching DirectoryIndex index.html CMS is typo3. any ideas? Thanks!
Maarten0