Mystery 404s
-
I have a large number of 404s that all follow a similar structure: www.kempruge.com/example/kemprugelaw. "kemprugelaw" keeps getting tacked onto the end of URLs. While I created www.kempruge.com/example/, I never created the www.kempruge.com/example/kemprugelaw page or edited permalinks to put kemprugelaw at the end of the URL. Any idea how this happens, and what I can do to make it stop?
Thanks,
Ruben
-
One by one is fine with me. I'd much prefer that to screwing up the site.
Thanks again,
Ruben
-
Hi Ruben
I'm glad that has helped you.
There is one way you could do multiple updates at once, BUT I would not recommend it, as doing it wrong could screw up your site. You could go into your hosting control panel, query your MySQL database via phpMyAdmin, and do a bulk search-and-replace on every reference to www.kempruge.com that doesn't have http:// in front, replacing www.kempruge.com with http://www.kempruge.com.
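To make that concrete, here is a rough sketch of the kind of search-and-replace involved, wrapped in a small script. It assumes the default WordPress wp_posts table and post_content column, that the bad links sit inside href attributes, and placeholder database credentials; you would normally just paste the UPDATE statement straight into phpMyAdmin. Back up the database before trying anything like this.

```python
# A minimal sketch of the bulk search-and-replace described above.
# Assumptions: default WordPress table prefix (wp_), links stored in
# post_content, and the pymysql package installed.
import pymysql

# Hypothetical credentials; substitute the real values from your host.
conn = pymysql.connect(host="localhost", user="db_user",
                       password="db_pass", database="wordpress_db")

try:
    with conn.cursor() as cur:
        # Only rewrite hrefs that are missing the protocol, so links that
        # already start with http:// are left untouched.
        cur.execute(
            """
            UPDATE wp_posts
            SET post_content = REPLACE(post_content,
                                       'href="www.kempruge.com',
                                       'href="http://www.kempruge.com')
            WHERE post_content LIKE '%href="www.kempruge.com%'
            """
        )
        print("Rows updated:", cur.rowcount)
    conn.commit()
finally:
    conn.close()
```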
Although it is a pain, I know, the best way is to fix the errors one by one in the pages themselves and leave the redirects running until you are sure that Google, Bing, and Yahoo have updated their indexes; then you can remove them.
If you copy http:// to your Mac/PC clipboard, it will be quicker to open the link dialog and paste it at the start of each URL.
Peter
-
Peter,
You're a genius! I'm almost certain that's it, because I can't remember adding "http://". Is there a way to get rid of those pages? I just 301 redirected them to where they are supposed to go, but I have a lot of redirects. When I say a lot, I mean a lot relative to how many pages we have: 500-something indexed pages and probably 200-something redirects. I know that many redirects slow our site down. I'd like to know if there's a better option than the 301s, if I can't just delete those pages.
Thanks,
Ruben
-
Hi Ruben
You mentioned: "In GWT, the 404s are slightly different. They are www.kempruge.com/example/www.kempruge.com."
I have seen this type of thing, or something similar, before when an absolute link has been entered in anchor text, or on its own, without http:// in front of it.
So the link has been entered as www.mydomain.com, which causes the error, when it should be entered as http://www.mydomain.com. Without the protocol, browsers and crawlers treat the link as a path relative to the current page, which is how you end up with the current page's URL plus your own domain tacked onto the end.
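As a quick illustration (using Python's standard library and example URLs), this is how a protocol-less link gets resolved against the page it appears on:

```python
# Why a link without http:// gets appended to the current page's URL
# instead of being treated as a separate site.
from urllib.parse import urljoin

current_page = "http://www.kempruge.com/example/"

# Link entered without the protocol: treated as a relative path.
print(urljoin(current_page, "www.kempruge.com"))
# -> http://www.kempruge.com/example/www.kempruge.com  (the 404 you see)

# Link entered with the protocol: treated as an absolute URL.
print(urljoin(current_page, "http://www.kempruge.com"))
# -> http://www.kempruge.com
```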
Your issue may be something completely different, but I thought I would post this as a possible solution.
Peter
-
In GWT (Google Webmaster Tools), the 404s are slightly different. They are www.kempruge.com/example/www.kempruge.com
In BWT (Bing Webmaster Tools), it's www.kempruge.com/example/kemprugelaw
In GWT, it says the 404s are coming from my site, but I couldn't find where it says that in BWT.
Any thoughts? Thanks for helping out; this has been bothering me for a while.
Ruben
-
It says it in Webmaster Tools; does that matter? I'm going to check where they're coming from now. Also, I know my sitemap 404s, but I can't figure out what happened. If you go here, http://www.kempruge.com/category/news/feed/, that's my sitemap. How it got changed to that, I have no idea. Plus, I can't find that page in the WP back end to change the URL back to the old one.
I tried redirecting the proper sitemap name to the one that works, but that didn't seem to work.
-
I crawled your site and didn't see the 404 errors.
I did notice that the sitemap referenced in your robots.txt 404s, so you may want to take a look at that.
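If it helps, here is a small sketch (using the requests library; treat it as illustrative only) that pulls the robots.txt, finds any Sitemap: lines, and reports the status code each declared sitemap URL returns:

```python
# Check whether the sitemap URL(s) declared in robots.txt actually resolve.
import requests

# Example domain from this thread; swap in your own.
robots = requests.get("http://www.kempruge.com/robots.txt", timeout=10).text

for line in robots.splitlines():
    if line.lower().startswith("sitemap:"):
        sitemap_url = line.split(":", 1)[1].strip()
        status = requests.get(sitemap_url, timeout=10).status_code
        print(sitemap_url, "->", status)
```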
-
Are you seeing these 404s in Webmaster Tools or when crawling the site?
If WMT, where does it say the 404 is linked from? Click on the URL with the 404 error in WMT and select the "Linked from" tab.
Crawl the site with Screaming Frog with your user agent set to Googlebot. See if the same 404 errors are picked up; if so, you can click on them and select the "In Links" tab to see which page each 404 is being picked up on.
I checked the source code of some of the pages on www.kempruge.com and didn't see any relative links, which usually create problems like this. My bet is on a site scraping your site and creating 404 errors when it links back to you.
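As a quick alternative to a full crawl, a spot check like this sketch (using the requests library; the URL is just an example from this thread) will confirm whether a suspect URL really returns a 404 when requested with a Googlebot user agent:

```python
# Spot-check a suspect URL with a Googlebot user agent to confirm the 404.
import requests

# Example URL from this thread; swap in any URL reported as a 404.
suspect_url = "http://www.kempruge.com/example/kemprugelaw"
headers = {
    "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                  "+http://www.google.com/bot.html)"
}

response = requests.get(suspect_url, headers=headers, allow_redirects=False)
print(suspect_url, "->", response.status_code)
```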