Sitemap issue - Tons of 404 errors
-
We recreated a client's site in a subdirectory (mysite.com/newsite) of his domain and, when it was ready to go live, added code to the .htaccess file to display the revamped website on the main URL. These are the directions we followed: http://codex.wordpress.org/Giving_WordPress_Its_Own_Directory and http://codex.wordpress.org/Moving_WordPress#When_Your_Domain_Name_or_URLs_Change. This has worked perfectly, except that we are now receiving a lot of 404 errors, and I'm wondering if this isn't the root of our evil.
This is a WordPress self-hosted website and we are actively using the WordPress SEO plugin that creates multiple folders with only 50 links in each. The sitemap_index.xml file tests well in Google Analytics but is pulling a number of links from the subdirectory folder.
I'm wondering if it really is the manner in which we made the site live that is our issue or if there is another problem that I cannot see yet. What is the best way to attack this issue? Any clues?
The site in question is www.atozqualityfencing.com
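One way to check whether the sitemap is still listing subdirectory URLs is to scan it for any entry whose path sits under /newsite/. What follows is only a minimal Python sketch, not something the WordPress SEO plugin provides: the function name `urls_under_prefix` and the sample sitemap are made up for illustration, and it assumes the sitemap uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Standard namespace used by sitemap and sitemap index files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_under_prefix(sitemap_xml, prefix):
    """Return every <loc> URL in the sitemap whose path starts with prefix."""
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for loc in root.iterfind(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if urlparse(url).path.startswith(prefix):
            flagged.append(url)
    return flagged


# Hypothetical sample data; a real check would load the XML from the site.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.atozqualityfencing.com/faq/</loc></url>
  <url><loc>http://www.atozqualityfencing.com/newsite/feed/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in urls_under_prefix(SAMPLE_SITEMAP, "/newsite/"):
        print(url)
```

The same function works on a sitemap index file, since the child sitemap addresses there are also stored in `<loc>` elements.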
-
Thanks again for the awesome help. I really appreciate your time and effort!!
-
I don't think it would snowball. It should be the end of the issue, as I think Google will have found all of the pages it is going to find. You might have some more pop up, like tag pages and things like that, but nothing major. I don't know if your webmaster is letting you see Webmaster Tools or not, but it shows the date when it last detected each error. It should look like this: http://screencast.com/t/5a9lpC6o. You can then click on the link and pull up this window: http://screencast.com/t/boyAdXGoOLl. From there you can see whether the links that were triggering the 404 pages were internal or external. It could very well be that external backlinks were triggering them. If they are internal links, to be safe I would search the source of the pages for the links.
Also, Moz's crawler should pick up the 404 errors and let you know if they are still caused by links on the site. The 301 redirects will handle the issue if the links were from the old site, but if the 404s come from broken internal links on the new site, I would find them with Moz's crawler or Raven's crawler and fix them.
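Searching the page source for leftover links can also be scripted. The sketch below is one possible approach using only the Python standard library; the names `LinkCollector` and `internal_links_under` and the sample HTML are hypothetical, and a real check would run against the saved source of each page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links_under(html, page_url, prefix):
    """Return absolute same-host links whose path starts with prefix."""
    collector = LinkCollector()
    collector.feed(html)
    site_host = urlparse(page_url).netloc
    hits = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative hrefs
        parts = urlparse(absolute)
        if parts.netloc == site_host and parts.path.startswith(prefix):
            hits.append(absolute)
    return hits


# Hypothetical page source used only to exercise the function.
SAMPLE_HTML = """
<a href="/newsite/conact/">Contact</a>
<a href="/fencing-styles/">Styles</a>
<a href="http://example.org/newsite/x">External</a>
"""

if __name__ == "__main__":
    page = "http://www.atozqualityfencing.com/"
    for link in internal_links_under(SAMPLE_HTML, page, "/newsite/"):
        print(link)
```

External links are skipped on purpose, since only links on the site itself can be fixed by editing the pages.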
-
Thank you for your insight Lesley! If we do as you suggest, will that be the end of the issue or could it snowball? Wouldn't you think that if there were changes to the site after Google indexed it the next crawl by Google would correct it? Is there a way to get Google to crawl it immediately? Probably not, huh? lol
-
This one is really difficult to diagnose. I am thinking there might have been changes to the site between the time Google first indexed it and now. I went to the Internet Archive and I could not see many of the pages, so I do not really know.
The fix, however, is to write 301 redirects for every page that is returning a 404 but still has a page that represents it. It looks like some of the pages might have had a URL change and others might have been done away with.
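With a couple of hundred URLs, the redirect lines can be generated from a list instead of typed by hand. Here is a rough Python sketch that prints Apache `RedirectMatch 301` lines for the .htaccess file. The helper name `redirect_rules` is invented for this example, and the target path /contact/ is purely a guess, so each old URL would need to be matched to its real replacement before using the output.

```python
import re


def redirect_rules(mapping, site_root):
    """Build one anchored RedirectMatch line per old path."""
    rules = []
    for old_path, new_path in sorted(mapping.items()):
        # Match the old path literally, with an optional trailing slash,
        # so that longer URLs sharing the same prefix are left alone.
        pattern = "^" + re.escape(old_path.rstrip("/")) + "/?$"
        rules.append("RedirectMatch 301 {} {}{}".format(pattern, site_root, new_path))
    return rules


# Old paths are taken from the 404 report; the targets are assumptions.
OLD_TO_NEW = {
    "/newsite/conact/": "/contact/",
    "/conact": "/contact/",
}

if __name__ == "__main__":
    for line in redirect_rules(OLD_TO_NEW, "http://www.atozqualityfencing.com"):
        print(line)
```

`RedirectMatch` with an anchored pattern is used here rather than plain `Redirect` because `Redirect` matches by path prefix, which could catch more URLs than intended.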
-
Thanks for your reply, Lesley. I am checking with the developer as to which exact steps she took to make the site live from a subdirectory. Some of the 404 pages include:
http://www.atozqualityfencing.com/newsite/feed/
http://www.atozqualityfencing.com/fencing-styles/
http://www.atozqualityfencing.com/fence-materials/conact
http://www.atozqualityfencing.com/newsite/conact/
http://www.atozqualityfencing.com/faq/wood-fencing-gallery
http://www.atozqualityfencing.com/faq/vinyl-fencing-gallery
http://www.atozqualityfencing.com/faq/structures-gallery
http://www.atozqualityfencing.com/faq/horse-fencing-gallery
http://www.atozqualityfencing.com/faq/horse-shelter-gallery
http://www.atozqualityfencing.com/conact
http://www.atozqualityfencing.com/author/aaron-smith/wood-fencing-gallery
There are a total of 210 of them.
What other information can I provide to help get this figured out?
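Several of the URLs above look like one page's slug appended to another page's path, which is the shape a crawler produces when a link's href has no leading slash. That is only a guess from the URL patterns and nothing in this thread confirms it. The Python sketch below shows the mechanism with the standard `urljoin` function; the page paths /faq/ and /author/aaron-smith/ are assumed from the list, and `resolve_all` is a made-up helper name.

```python
from urllib.parse import urljoin

BASE = "http://www.atozqualityfencing.com"


def resolve_all(pairs):
    """Resolve each (page_url, href) pair the way a crawler would."""
    return [urljoin(page_url, href) for page_url, href in pairs]


# Hypothetical links: the same slug written with and without a leading slash.
EXAMPLES = [
    (BASE + "/faq/", "wood-fencing-gallery"),
    (BASE + "/author/aaron-smith/", "wood-fencing-gallery"),
    (BASE + "/faq/", "/wood-fencing-gallery"),
]

if __name__ == "__main__":
    for resolved in resolve_all(EXAMPLES):
        print(resolved)
```

The first two results reproduce URLs from the 404 list, while the root-relative href in the third resolves to the same address no matter which page carries the link.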
-
It is really hard to tell without seeing the errors. Are the pages at the same addresses as the previous pages? Did you redirect them? Is there something wrong internally that is hard to spot? It would be easier to diagnose if we could see a list of the 404 pages.
Related Questions
-
Discovered - currently not indexed issue
Hello all, We have a sitemap with URLs that have mostly user-generated content: a Profile Overview section where users write about their services and some other things. Out of 46K URLs, only 14K are valid according to Search Console and 32K URLs are excluded. Out of these 32K, 28K are "Discovered - currently not indexed". We can't really update these pages as they have user-generated content. However, we do want to leverage all these pages to help us in our SEO. So the question is, how do we make all of these pages indexable? If anyone can help in this regard, please let me know. Thanks!
Technical SEO | | akashkandari0 -
Search Console has found over 18k 404 errors in my site, should I redirect?
Most of them were old URLs linked from a really old domain that we have just shut down. If the pages didn't receive any traffic, should we redirect? If I follow https://moz.com/learn/seo/http-status-codes, we shouldn't.
Technical SEO | | pablo_carrara0 -
Redundant Hostnames Issue in GA
I noticed another post on this, but I have another question. I am getting this message from Analytics: Property http://www.example.com is receiving data from redundant hostnames. Consider setting up a 301 redirect on your website, or make a search and replace filter that strips "www." from hostnames. Examples of redundant hostnames: example.com, www.example.com. We don't have a 301 in place that manages this and I am quite concerned about handling that the right way. We do have a canonical on our homepage that says: <link rel="canonical" href="http://www.example.com/" /> I asked on another site how to safely set up our 301 and I got this response:
RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L,NE]
Is this the best way of handling it? Are there situations where this would not be the best way? We do have a few subdomains like beta.example.com in use and have a rather large site, so I just want to make sure I get it right. Thanks for your help! Craig
Technical SEO | | TheCraig0 -
Issue with Cached pages
I have a client who has three domains:
budgetkits.co.uk
prosocceruk.co.uk
cheapfootballkits.co.uk
Budget Kits is not active but Pro Soccer and Cheap Football Kits are. The issue is that when you do site:budgetkits.co.uk on Google it brings back results. If you click on a link it goes to a page saying the website doesn't exist, which is correct, but if you click on cached it shows you a page from prosocceruk.co.uk or cheapfootballkits.co.uk. The cached pages are very recent, from a couple of days to a week ago. The first result brings up www.budgetkits.co.uk/rainwear but the cached page is www.prosocceruk.co.uk/rainwear. The third result brings up www.budgetkits.co.uk/kids-football-kits but the cached page is http://www.cheapfootballkits.co.uk. The history of this issue is that budgetkits.co.uk was its own website 7 years ago and after that it used to point at prosocceruk.co.uk, but it has not done so for about two months. All files have been deleted from budgetkits.co.uk so it is just a domain. Any help with this would be very much appreciated as I have not seen this kind of issue before.
Technical SEO | | paulbaguley0 -
Error 404, WordPress adds the domain automatically to the end of the pages, WHY?
Hello guys, I'm using WordPress and Yoast to help me improve my SEO. Everything went well until today, when Moz found 404 errors while crawling the website, showing my website's domain appended to the end of 12 URLs. For example:
www.domain.com/service-1/www.domain.com
www.domain.com/contact-page/www.domain.com
Do you have any idea where that comes from? Thanks, Alex
Technical SEO | | abonnisseau0 -
I have 404 errors but can't find where these links are?
The 4xx report had 0 errors, and then on the recent crawl it found over 200. They are all variations on real URLs e.g.: Real URL: http://www.bullseyeuk.com/10-up-deluxe-literature-holder.html 404 Error URL: http://www.bullseyeuk.com/10-up-deluxe-literature-holder.html �� None of them are linked to the root domain and I can't find where they are coming from. Any ideas? Thanks Jack
Technical SEO | | JackMurphy0 -
301 redirect issues
Hi all, I'm hoping someone will be able to help me with an extremely frustrating problem with 301 redirects in .htaccess. Basically I'm trying to redirect some old pages (from our old website) that still rank to the new equivalent. For example:
Old URL: www.domain.com/frames/news/company-news/news-reader.php?newsStoryID=395
New URL: www.domain.com/news/article-title
I've tried the simple
redirect 301 /frames/news/company-news/news-reader.php?newsStoryID=395 http://www.domain.com/news/article-title
But this doesn't work. I've also tried:
RewriteEngine on
RewriteCond %{QUERY_STRING} ^newsStoryID=395$
RewriteRule ^/news-reader.php$ http://www.domain.com/news/article-title/? [L,R=301]
Could anyone help? I've followed lots of tutorials that all match the above but it just doesn't work! The only other thing within the htaccess file is from WordPress for pretty permalinks:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
Many thanks in advance!
Technical SEO | | EclipseLegal0 -
Domain restructure, sitemaps and indexing
I've got a handcoded site with around 1500 unique articles and a handcoded sitemap. Very old school. The URL structure is a bit of a mess, so to make things easier for a developer who'll be making the site database-driven, I thought I'd recategorise the content. Same content, but with a new URL structure (I thought I'd juice up the URLs for SEO purposes while I was at it). To this end, I took categories like:
/body/amazing-big-shoes/
/style/red-boots/
/technology/cyber-boots/
And rehoused all the content like so, doing it all manually with FTP:
/boots/amazing-boots/
/boots/red-boots/
/boots/cyber-boots/
I placed 301 redirects in the .htaccess file like so:
redirect 301 /body/amazing-boots/ http://www.site.co.uk/boots/amazing-boots/
(not doing redirects for each article, just for categories, which seemed to make the articles redirect nicely). Then I went into sitemap.xml and manually overwrote all the entries to reflect the new URL structure, but keeping the old dates of the original entries, like so:
<url>
<loc>http://www.site.co.uk/boots/amazing-boots/index.php</loc>
<lastmod>2008-07-08</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
And resubmitted the sitemap to Google Webmasters. This was done 4 days ago. Webmaster said that the 1400 of 1500 articles indexed had dropped to 860, and today it's climbed to 939. Did I adopt correct procedure? Am I going about things the right way? Given a little time, can I expect Google to re-index the new pages nicely? I appreciate I've made a lot of changes in one fell swoop which could be a bit of a no-no... ? PS Apologies if this question appears twice on Q&A - hopefully I haven't double-posted
Technical SEO | | magdaknight0