Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Googlebot HTTP 204 Status Code Handling?
-
If a user runs a search that returns no results, and the server returns a 204 (No Content), will Googlebot treat that as the rough equivalent of a 404 or a noindex? If not, then it seems one would want to noindex the page to avoid low quality penalties, but that might require more back and forth with the server, which isn't ideal.
Kurus
-
Thanks for your input.
-
I believe Google handles 204 codes the same as 200. They index a page with basically no content. Unless someone links to a 204 page however, Google will never see one by your example. Google is not out and about running searches on websites to see what comes up to find more content to index. If someone were to search on your site and get a 204, then link to it, then yeah, Google could crawl and index it. In that case though you might see it in your webmaster tools under crawl errors. Then you could noindex it or block it with robots.txt or something else.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to handle sorting, filtering, and pagination in ecommerce? Canonical is enough?
Hello, after reading various articles and watching several videos I'm still not sure how to handle faceted navigation (sorting/filtering) and pagination on my ecommerce site. Current indexation status: The number of "real" pages (from my sitemap) - 2.000 pages Google Search Console (Valid) - 8.000 pages Google Search Console (Excluded) - 44.000 pages Additional info: Vast majority of those 50k additional pages (44 + 8 - 2) are pages created by sorting, filtering and pagination. Example of how the URL changes while applying filters/sorting: example.com/category --> example.com/category/1/default/1/pricefrom/100 Every additional page is canonicalized properly, yet as you can see 6k is still indexed. When I enter site:example.com/category in Google it returns at least several results (in most of the cases the main page is on the 1st position). In Google Analytics I can see than ~1.5% of Google traffic comes to the sorted/filtered pages. The number of pages indexed daily (from GSC stats) - 3.000 And so I have a few questions: Is it ok to have those additional pages indexed or will the "real" pages rank higher if those additional would not be indexed? If it's better not to have them indexed should I add "noindex" to sorting/filtering links or add eg. Disallow: /default/ in robots.txt? Or perhaps add "noindex, nofollow" to the links? Google would have then 50k pages less to crawl but perhaps it'd somehow impact my rankings in a negative way? As sorting/filtering is not based on URL parameters I can't add it in GSC. Is there another way of doing that for this filtering/sorting url structure? Thanks in advance, Andrew
Intermediate & Advanced SEO | | thpchlk0 -
We are redirecting http and non www versions of our website. Should all versions http (non www version and www version) and https (non www version) should just have 1 redirect to the https www version?
We are redirecting http and non www versions of our website. Should all versions http (non www version and www version) and https (non www version) should just have 1 redirect to the https www version? Thant way all forms of the website are pointing to one version?
Intermediate & Advanced SEO | | Caffeine_Marketing0 -
Would you rate-control Googlebot? How much crawling is too much crawling?
One of our sites is very large - over 500M pages. Google has indexed 1/8th of the site - and they tend to crawl between 800k and 1M pages per day. A few times a year, Google will significantly increase their crawl rate - overnight hitting 2M pages per day or more. This creates big problems for us, because at 1M pages per day Google is consuming 70% of our API capacity, and the API overall is at 90% capacity. At 2M pages per day, 20% of our page requests are 500 errors. I've lobbied for an investment / overhaul of the API configuration to allow for more Google bandwidth without compromising user experience. My tech team counters that it's a wasted investment - as Google will crawl to our capacity whatever that capacity is. Questions to Enterprise SEOs: *Is there any validity to the tech team's claim? I thought Google's crawl rate was based on a combination of PageRank and the frequency of page updates. This indicates there is some upper limit - which we perhaps haven't reached - but which would stabilize once reached. *We've asked Google to rate-limit our crawl rate in the past. Is that harmful? I've always looked at a robust crawl rate as a good problem to have. Is 1.5M Googlebot API calls a day desirable, or something any reasonable Enterprise SEO would seek to throttle back? *What about setting a longer refresh rate in the sitemaps? Would that reduce the daily crawl demand? We could set increase it to a month, but at 500M pages Google could still have a ball at the 2M pages/day rate. Thanks
Intermediate & Advanced SEO | | lzhao0 -
How to handle a blog subdomain on the main sitemap and robots file?
Hi, I have some confusion about how our blog subdomain is handled in our sitemap. We have our main website, example.com, and our blog, blog.example.com. Should we list the blog subdomain URL in our main sitemap? In other words, is listing a subdomain allowed in the root sitemap? What does the final structure look like in terms of the sitemap and robots file? Specifically: **example.com/sitemap.xml ** would I include a link to our blog subdomain (blog.example.com)? example.com/robots.xml would I include a link to BOTH our main sitemap and blog sitemap? blog.example.com/sitemap.xml would I include a link to our main website URL (even though it's not a subdomain)? blog.example.com/robots.xml does a subdomain need its own robots file? I'm a technical SEO and understand the mechanics of much of on-page SEO.... but for some reason I never found an answer to this specific question and I am wondering how the pros do it. I appreciate your help with this.
Intermediate & Advanced SEO | | seo.owl0 -
Can an incorrect 301 redirect or .htaccess code cause 500 errors?
Google Webmaster Tools is showing the following message: _Googlebot couldn't access the contents of this URL because the server had an internal error when trying to process the request. These errors tend to be with the server itself, not with the request. _ Before I contact the person who manages the server and hosting (essentially asking if the error is on his end) is there a chance I could have created an issue with an incorrect 301 redirect or other code added to .htaccess incorrectly? Here is the 301 redirect code I am using in .htaccess: RewriteEngine On RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/.]+/)*(index.html|default.asp)\ HTTP/ RewriteRule ^(([^/.]+/)*)(index|default) http://www.example.com/$1 [R=301,L] RewriteCond %{HTTP_HOST} !^(www.example.com)?$ [NC] RewriteRule (.*) http://www.example.com/$1 [R=301,L] Could adding the following code after that in the .htaccess potentially cause any issues? BEGIN EXPIRES <ifmodule mod_expires.c="">ExpiresActive On
Intermediate & Advanced SEO | | kimmiedawn
ExpiresDefault "access plus 10 days"
ExpiresByType text/css "access plus 1 week"
ExpiresByType text/plain "access plus 1 month"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType application/x-javascript "access plus 1 month"
ExpiresByType application/javascript "access plus 1 week"
ExpiresByType application/x-icon "access plus 1 year"</ifmodule> END EXPIRES (Edit) I'd like to add that there is a Wordpress blog on the site too at www.example.com/blog with the following code in it's .htaccess: BEGIN WordPress <ifmodule mod_rewrite.c="">RewriteEngine On
RewriteBase /blog/
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]</ifmodule> END WordPress Thanks0 -
One company, two address. How do I handle footer NAP?
I have a client with two address that fall under the same brand. One address is in CA and the other is in NY. I have a single domain and will be creating separate landing pages for each location but wanted to know how I should handle the NAP in the footer of the other pages. Should I list both NAPs, one NAP or neither NAPs in the footer? Thanks in advance for your help.
Intermediate & Advanced SEO | | DigitalWorkboots0 -
Effects of having both http and https on my website
You are able to view our website as either http and https on all pages. For example: You can type "http://mywebsite.com/index.html" and the site will remain as http: as you navigate the site. You can also type "https://mywebsite.com/index.html" and the site will remain as https: as you navigate the site. My question is....if you can view the entire site using either http or https, is this being seen as duplicate content/pages? Does the same hold true with "www.mywebsite.com" and "mywebsite.com"? Thanks!
Intermediate & Advanced SEO | | rexjoec1 -
Robots.txt is blocking Wordpress Pages from Googlebot?
I have a robots.txt file on my server, which I did not develop, it was done by the web designer at the company before me. Then there is a word press plugin that generates a robots.txt file. How Do I unblock all the wordpress pages from googlebot?
Intermediate & Advanced SEO | | ENSO0