Repeated mysterious 404's from ancient site structure killing my rankings
-
Several years ago I changed my site from a Flash-based site to a blog-based WordPress site. After doing so I went from page 1 to page 30 for my relevant search terms. I have employed people to help me track down the problem, and I believe they have narrowed it to the existence of 404's being created from some unknown internal source. For years I have been getting links like this...
...regularly showing in Webmaster Tools (this is from a Top Pages report from Moz, which also shows hundreds of them).
When I do a Moz crawl of the site, none of these links show up. Therefore I have no way of finding the source of these links (WMT also does not show me the source, as it should).
We have completely cleared the site and rebuilt it, and although it has only been a couple of weeks, it does not appear to have stopped them.
Does anyone have any way of helping me find the source of these mysterious 404's?
-
Why bother trying to clean anything up? If there are links somewhere out there to your domain and they're 404'ing, just 301 them to new pages on your site! Capture that link juice; don't let it run out.
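For what it's worth, here is a rough Python sketch of the 301 idea, just to show the lookup logic. Everything in it is an assumption for illustration: the `LEGACY_REDIRECTS` table, the `/old-section/` and `/blog/` paths, and the `redirect_target` function name are all made up, and on a WordPress site the redirect itself would more likely live in the server configuration or a redirect plugin than in Python.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

# Hypothetical mapping from legacy path prefixes to current pages.
# Both sides are placeholders; fill them in from your own 404 report.
LEGACY_REDIRECTS: Dict[str, str] = {
    "/old-section/2011/03/": "/blog/",
}


def redirect_target(url: str, redirects: Dict[str, str] = LEGACY_REDIRECTS) -> Optional[str]:
    """Return the path a legacy URL should 301 to, or None if no rule matches."""
    path = urlsplit(url).path
    # Check the longest prefixes first so the most specific rule wins.
    for prefix in sorted(redirects, key=len, reverse=True):
        if path.startswith(prefix):
            return redirects[prefix]
    return None
```

A request for `http://example.com/old-section/2011/03/some-post` would then be answered with a 301 to `/blog/`, while any URL that matches no rule is left alone.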
-
Thanks for your reply, EEE3.
Each ancient link is reported as linked from another ancient page that no longer exists, and its last-crawled and first-detected dates are always the day it arrives.
e.g. last crawled 4/23/14, first detected 4/23/14
http://www.dfphotographer.com.au/brisbaneweddingphotographer/2011/03/st-kilda-wedding......
linked from
http://dfphotographer.com.au/brisbaneweddingphotographer/index.php/2011/03/st-kilda-wedding.....
and
http://dfphotographer.com.au/brisbaneweddingphotographer/2011/03/st-kilda-wedding....
-
Thanks for your response Keri,
As you are staff, can you please tell me where the Top Pages data comes from? Is it from crawling my site (like a Google spider), or is it sourced from Google or somewhere else? How often is that data refreshed?
In answer to your response, I have tried both Screaming Frog and Xenu, and my nice clean site structure is all they pick up. None of the ancient messy site structure appears.
I have been through the list of domains looking for an old sitemap or something similar that may have been scraped off my site, but after a long and arduous search I could not locate any reference to any of the links that show up in Top Pages and Webmaster Tools (which says they are linked from other ancient pages, which I will expand on below).
We have looked at all the usual suspects (old sitemaps, plugins) and rebuilt the site just in case we missed anything that was lingering around. I have had really good people looking at it, and they continue to do so; it just never seems to go away.
-
In Webmaster Tools, when you click on the 404 and the popup window appears, what is showing in the Linked from tab?
-
I edited the post so the URLs didn't run together. Still not perfect, but a little easier to read.
I'm not exactly sure where those links are coming from. You might run a tool like Xenu Link Sleuth or Screaming Frog on your site to see if there is an internal linking widget gone awry. The other thought I have is to look at Open Site Explorer to see what sites are linking to you and if they're linking to any of those pages.
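If you want to double-check a single page by hand, the internal-link check can be sketched in a few lines of Python using only the standard library. This is an illustration, not how Xenu or Screaming Frog work: `LinkCollector` and `find_legacy_links` are hypothetical names, and the `marker` string is simply whatever path fragment your old URLs share.

```python
from html.parser import HTMLParser
from typing import List


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_legacy_links(html: str, marker: str) -> List[str]:
    """Return the hrefs in `html` that contain the legacy path fragment `marker`."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.hrefs if marker in href]
```

Feeding it the saved HTML of a page together with a marker such as `/brisbaneweddingphotographer/` would list any links on that page that still point into the old structure.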