Sitemap issue - Tons of 404 errors
-
We've recreated a client site in a subdirectory (mysite.com/newsite) of his domain and when it was ready to go live, added code to the htaccess file in order to display the revamped website on the main url. These are the directions that were followed to do this: http://codex.wordpress.org/Giving_WordPress_Its_Own_Directory and http://codex.wordpress.org/Moving_WordPress#When_Your_Domain_Name_or_URLs_Change. This has worked perfectly except that we are now receiving a lot of 404 errors am I'm wondering if this isn't the root of our evil.
This is a WordPress self-hosted website and we are actively using the WordPress SEO plugin that creates multiple folders with only 50 links in each. The sitemap_index.xml file tests well in Google Analytics but is pulling a number of links from the subdirectory folder.
I'm wondering if it really is the manner in which we made the site live that is our issue or if there is another problem that I cannot see yet. What is the best way to attack this issue? Any clues?
The site in question is www.atozqualityfencing.com
-
Thanks again for the awesome help. I really appreciate your time and effort!!
-
I don't think it would snowball. It should be the end of the issue, as I think google will have found all of the pages it is going to find. You might have some more popup like tags pages and thing like that, but nothing major. I don't know if your webmaster is letting you see the webmaster tools or not, but it has an error date of when it last detected the error. It should look like this, http://screencast.com/t/5a9lpC6o then you can click on the link and pull this window up, http://screencast.com/t/boyAdXGoOLl From there you can see if the links were internal or external that were triggering the 404 pages. It could very well be that external backlinks were triggering them. If they are internal links, to be safe I would search the source of the pages for the links.
Also, Moz's crawler should pick up the 404 errors and let you know if it is still because of links on the site. The 301 redirects will handle the issue if the links were from the old site, but if the links are because of internal links on the new site that are broken, I would find them and fix them with Moz's crawler or Ravens Crawler.
-
Thank you for your insight Lesley! If we do as you suggest, will that be the end of the issue or could it snowball? Wouldn't you think that if there were changes to the site after Google indexed it the next crawl by Google would correct it? Is there a way to get Google to crawl it immediately? Probably not, huh? lol
-
This one is really difficult to tell what has actually gone wrong. I am thinking there might have been changes to the site once google indexed the site for the first time and the point it is at now. I went to the internet archive and I could not see many of the pages, so I do not really know.
The fix however is to write 301 redirects for all of the pages that are pulling a 404, but there is a page that represents them. It looks like some of the pages might have had a url change and others might have been done away with.
-
Thanks for your reply, Lesley. I am checking with the developer as to which exact steps she took to make the site live from a subdirectory. Some of the 404 pages include:
http://www.atozqualityfencing.com/newsite/feed/
http://www.atozqualityfencing.com/fencing-styles/
http://www.atozqualityfencing.com/fence-materials/conact
http://www.atozqualityfencing.com/newsite/conact/
http://www.atozqualityfencing.com/faq/wood-fencing-gallery
http://www.atozqualityfencing.com/faq/vinyl-fencing-gallery
http://www.atozqualityfencing.com/faq/structures-gallery
http://www.atozqualityfencing.com/faq/horse-fencing-gallery
http://www.atozqualityfencing.com/faq/horse-shelter-gallery
http://www.atozqualityfencing.com/conact
http://www.atozqualityfencing.com/author/aaron-smith/wood-fencing-galleryThere are a total of 210 of them.
What other information can I provide to help get this figured out?
-
It is really hard to tell without seeing the errors. Are the pages at the same address as the previous pages? Did you redirect them? Is there something internally wrong that is hard to tell? It would be easier to diagnose if we could the a list of the 404 pages.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Hundreds of 404 errors are showing up for pages that never existed
For our site, Google is suddenly reporting hundreds of 404 errors, but the pages they are reporting never existed. The links Google shows are clearly spam style, but the website hasn't been hacked. This happened a few weeks ago, and after a couple days they disappeared from WMT. What's the deal? Screen-Shot-2016-02-29-at-9.35.18-AM.png
Technical SEO | | MichaelGregory0 -
Why is Google Webmaster Tools showing 404 Page Not Found Errors for web pages that don't have anything to do with my site?
I am currently working on a small site with approx 50 web pages. In the crawl error section in WMT Google has highlighted over 10,000 page not found errors for pages that have nothing to do with my site. Anyone come across this before?
Technical SEO | | Pete40 -
Duplicate Content Issue
My issue with duplicate content is this. There are two versions of my website showing up http://www.example.com/ http://example.com/ What are the best practices for fixing this? Thanks!
Technical SEO | | OOMDODigital0 -
404 error due to a page which requires a login
what do I do with 404 errors reported in webmaster tools that are actually URLs where users are clicking a link that requires them to log in (so they get sent to a login page). what's the best practice in these cases? Thanks in advance!
Technical SEO | | joshuakrafchin0 -
Noob 101 - Sitemaps
Hi guys, looking for some sitemap help. I'm running two seperate systems so my auto-generated sitemap on the main system has a few holes in it. I'd like to submit this to webmaster anyway, and then plug the holes with missing pages by adding them to 'Fetch as Google'. Does that make sense or will Google ignore one of them? Many thanks, Idiot
Technical SEO | | uSwSEO0 -
Seek help correcting large number of 404 errors generated, 95% traffic halt
Hi, The following GWT screen tells a bit of the story: site: http://bit.ly/mrgdD0 http://www.diigo.com/item/image/1dbpl/wrbp On about Feb 8 I decided to fix a large number of 'duplicate title' warnings being reported in GWT "HTML Suggestions" -- these were for URLs which differed only in parameter case, and which had Canonical tags, but were still reported as dups in GWT. My traffic had been steady at about 1000 clicks/day. At midnight on 2/10, google traffic completely halted, down to 11 clicks/day. I submitted a recon request and was told 'no manual penalty' Also, the 'sitemap' indexes in GWT showed 'pending' for 24x7 starting then. By about the 18th, the 'duplicate titles' count dropped to about 600 or so... the next day traffic hopped right back to about 800 clicks/day - for a week - then stopped again, down to 10/day, a week later, on the 26th. I then noticed that GWT was reporting 20K page-not found errors - this has now grown to 35K such errors! I realized that bogus internal links were being generated as I failed to disable the PHP warning messages.... so I disabled PHP warnings and fixed what I thought was the source of the errors. However, the not-found count continues to climb -- and I don't know where these bad internal links are coming from, because the GWT report lists these link sources as 'unavailable'. I'v been through a similar problem last year and it took months (4) for google to digest all the bogus pages ad recover. If I have to wait that long again I will lose much $$. Assuming that the large number of 404 internal errors is the reason for the sudden shutoff... How can I a) verify the source of these internal links, given that google says the source pages are 'unavailable'.. Most critically, how can I do a 'RESET" and have google re-spider my site -- or block the signature of these URLs in order to get rid of these errors ASAP?? thanks
Technical SEO | | mantucket0 -
Htaccess issue
I have some urls in my site due to a rating counter. These are like: domain.com/?score=4&rew=25
Technical SEO | | sesertin
domain.com/?score=1&rew=28
domain.com/?score=5&rew=95 These are all duplicate content to my homepage and I want to 301 redirect them there. I tried so far: RedirectMatch 301 /[a-z]score[a-z] http://domain.com
RedirectMatch 301 /.score. http://domain.com
RedirectMatch 301 /^score$.* http://domain.com
RedirectMatch 301 /.^score$.* http://domain.com
RedirectMatch 301 /[a-z]score[a-z] http://domain.com
RedirectMatch 301 score http://domain.com
RedirectMatch 301 /[.]score[.] http://domain.com
RedirectMatch 301 /[.]score[.] http://domain.com
RedirectMatch 301 /[a-z,0-9]score[a-z,0-9] http://domain.com
RedirectMatch 301 /[a-z,0-9,=,&]score[a-z,0-9,=,&] http://domain.com
RedirectMatch 301 /[a-z,0-9,=&?/.]score[a-z,0-9,=&] http://domain.com None of them works. Anybody? Solution? Would be very much appriciated0 -
Duplicate Pages Issue
I noticed a problem and I was wondering if anyone knows how to fix it. I was a sitemap for 1oxygen.com, a site that has around 50 pages. The sitemap generator come back with over a 2000 pages. Here is two of the results: http://www.1oxygen.com/portableconcentrators/portableconcentrators/portableconcentrators/services/rentals.htm
Technical SEO | | chuck-layton
http://www.1oxygen.com/portableconcentrators/portableconcentrators/1oxygen/portableconcentrators/portableconcentrators/portableconcentrators/oxusportableconcentrator.htm These are actaully pages somehow. In my FTP there in the first /portableconentrators/ folder there is about 12 html documents and no other folders. It looks like it is creating a page for every possible folder combination. I have no idea why you those pages above actually work, help please???0