Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Better to 301 or de-index 403 pages
-
Google WMT recently found and called out a large number of old unpublished pages as access denied errors. The pages are tagged "noindex, follow." These old pages are in Google's index.
At this point, would it better to 301 all these pages or submit an index removal request or what? Thanks... Darcy
-
Sounds solid. Thanks, Dirk!
-
The main reason why errors are listed is that you can solve them (if necessary). If these are old pages that don't have existing links on your pages - you can just forget about these warnings. However, if these warnings appear because actual pages are linking to non-existing pages this will lead to a degraded user experience and user experience is a factor which counts for SEO.
If you look at the 403 errors - normally WMT lists how the bot got to these pages. If the pages that are linking to this 403 pages are still on your site, you have to remove these links.
If you have dropped in traffic, you could try to do a full crawl of your site using screaming frog of Xenu, to do a quick check-up of the technical health of your site.
If you still have an old sitemap, or the most popular pages in Google Analytics from the period before migration, you could also use these url's as input for Screamingfrog - and check if all pages were properly redirected. If errors pop-up, these would be the ones I would redirect. I understood from your initial question that the 403's where coming from very old pages which were never meant to be accessible.
rgds
Dirk
-
Hi Dirk,
Thanks for the message. You may be right. Thing is, GWT's discovery of this large number of now blocked pages (previously indexed) seems to have coincided with a big drop in search overall.
I guess the part that I wonder about it is, if these now blocked pages as 403s are no problem and Google will just figure it out, why does it bother to list them in errors... just in case you didn't know, but that it doesn't in fact care one way or the other search-wise and it won't affect your other pages? Just wondering. Thanks... Darcy
-
It's not really necessary to 301 these pages - a 403 status code informs Google that the access is denied (Literally: The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated.)
Normally these pages will disappear from WMT after a while. If you find these 403 annoying in your WMT reports, you can always 301 them - but this isn't strictly necessary.
Removal tool - Google's advice is not to use the tool "to clean up cruft, like old pages that 404" (source: https://support.google.com/webmasters/answer/1269119?hl=en).
rgds
Dirk
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
React.js Single Page Application Not Indexing
We recently launched our website that uses React.js and we haven't been able to get any of the pages indexed. Our previous site (which had a .ca domain) ranked #1 in the 4 cities we had pages and we redirected it to the .com domain a little over a month ago. We have recently started using prerender.io but still haven't seen any success. Has anyone dealt with a similar issue before?
Intermediate & Advanced SEO | | m_van0 -
Moving html site to wordpress and 301 redirect from index.htm to index.php or just www.example.com
I found page duplicate content when using Moz crawl tool, see below. http://www.example.com
Intermediate & Advanced SEO | | gozmoz
Page Authority 40
Linking Root Domains 31
External Link Count 138
Internal Link Count 18
Status Code 200
1 duplicate http://www.example.com/index.htm
Page Authority 19
Linking Root Domains 1
External Link Count 0
Internal Link Count 15
Status Code 200
1 duplicate I have recently transfered my old html site to wordpress.
To keep the urls the same I am using a plugin which appends .htm at the end of each page. My old site home page was index.htm. I have created index.htm in wordpress as well but now there is a conflict of duplicate content. I am using latest post as my home page which is index.php Question 1.
Should I also use redirect 301 im htaccess file to transfer index.htm page authority (19) to www.example.com If yes, do I use
Redirect 301 /index.htm http://www.example.com/index.php
or
Redirect 301 /index.htm http://www.example.com Question 2
Should I change my "Home" menu link to http://www.example.com instead of http://www.example.com/index.htm that would fix the duplicate content, as indx.htm does not exist anymore. Is there a better option? Thanks0 -
How to check if the page is indexable for SEs?
Hi, I'm building the extension for Chrome, which should show me the status of the indexability of the page I'm on. So, I need to know all the methods to check if the page has the potential to be crawled and indexed by a Search Engines. I've come up with a few methods: Check the URL in robots.txt file (if it's not disallowed) Check page metas (if there are not noindex meta) Check if page is the same for unregistered users (for those pages only available for registered users of the site) Are there any more methods to check if a particular page is indexable (or not closed for indexation) by Search Engines? Thanks in advance!
Intermediate & Advanced SEO | | boostaman0 -
Google indexed wrong pages of my website.
When I google site:www.ayurjeewan.com, after 8 pages, google shows Slider and shop pages. Which I don't want to be indexed. How can I get rid of these pages?
Intermediate & Advanced SEO | | bondhoward0 -
Should we 301 redirect old events pages on a website?
We have a client that has an events category section that is filled to the brim with past events webpages. Another issue is that these old events webpages all contain duplicate meta description tags, so we are concerned that Google might be penalizing our client's website for this issue. Our client does not want to create specialized meta description tags for these old events pages. Would it be a good idea to 301 redirect these old events landing pages to the main events category page to pass off link equity & remove the duplicate meta description tag issue? This seems drastic (we even noticed that searchmarketingexpo.com is keeping their old events pages). However it seems like these old events webpages offer little value to our website visitors. Any feedback would be much appreciated.
Intermediate & Advanced SEO | | RosemaryB0 -
Links from non-indexed pages
Whilst looking for link opportunities, I have noticed that the website has a few profiles from suppliers or accredited organisations. However, a search form is required to access these pages and when I type cache:"webpage.com" the page is showing up as non-indexed. These are good websites, not spammy directory sites, but is it worth trying to get Google to index the pages? If so, what is the best method to use?
Intermediate & Advanced SEO | | maxweb0 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all pages indexed of our domain? trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Dynamic pages - ecommerce product pages
Hi guys, Before I dive into my question, let me give you some background.. I manage an ecommerce site and we're got thousands of product pages. The pages contain dynamic blocks and information in these blocks are fed by another system. So in a nutshell, our product team enters the data in a software and boom, the information is generated in these page blocks. But that's not all, these pages then redirect to a duplicate version with a custom URL. This is cached and this is what the end user sees. This was done to speed up load, rather than the system generate a dynamic page on the fly, the cache page is loaded and the user sees it super fast. Another benefit happened as well, after going live with the cached pages, they started getting indexed and ranking in Google. The problem is that, the redirect to the duplicate cached page isn't a permanent one, it's a meta refresh, a 302 that happens in a second. So yeah, I've got 302s kicking about. The development team can set up 301 but then there won't be any caching, pages will just load dynamically. Google records pages that are cached but does it cache a dynamic page though? Without a cached page, I'm wondering if I would drop in traffic. The view source might just show a list of dynamic blocks, no content! How would you tackle this? I've already setup canonical tags on the cached pages but removing cache.. Thanks
Intermediate & Advanced SEO | | Bio-RadAbs0