Very weird pages. 2900 403 errors in page crawl for a site that only has 140 pages.
-
Hi there,
I just crawled the website of one of my clients with the Moz crawl tool.
I get 2900 403 errors, but there are only 140 pages on the website.
Here is an example of the URLs the crawl reports as errors:
http://www.mysite.com/en/www.mysite.com/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en
There are 2900 pages like this.
I have tried visiting the pages and they load, but only as plain HTML pages without CSS.
Can you help me figure out what the problem is? We have seen huge drops in traffic since September.
-
Thank you so much for your response!
Yes. Could you please email me at eliotostiguy@gmail.com? I will be able to give you the URL by email.
-
Almost right, but not quite: the 403 error is only served once a URL has actually been requested. The content may not be accessible (it's forbidden), but the URL itself still is. Whilst it's unlikely that these URLs would ever be indexed, there is still an infinite loop in the link architecture, which could eat into crawl budget and hurt site health metrics.
I'd get it sorted out!
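If it helps to size the problem first, here is a rough sketch for flagging looping URLs in a crawl export. It assumes you have the crawled URLs as plain strings; the function names and the threshold of 3 repeats are my own illustration, not anything built into Moz.

```python
from urllib.parse import urlsplit


def longest_segment_run(url):
    """Return the longest run of identical consecutive path segments."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    longest = 0
    run = 0
    previous = None
    for segment in segments:
        # Extend the run if this segment repeats the previous one.
        run = run + 1 if segment == previous else 1
        longest = max(longest, run)
        previous = segment
    return longest


def looks_like_loop(url, threshold=3):
    """Flag URLs where one segment repeats `threshold` or more times in a row.

    The threshold is an assumption; tune it to the site's real URL structure.
    """
    return longest_segment_run(url) >= threshold


crawled = [
    "http://www.mysite.com/en/index.html",
    "http://www.mysite.com/en/www.mysite.com/en/en/en/index.html#?lang=en",
]
suspect = [u for u in crawled if looks_like_loop(u)]
```

With the two sample URLs above, only the second one ends up in `suspect`. A lower threshold catches the loop earlier but risks flagging legitimate URLs that happen to repeat a folder name.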
-
But 403 is a forbidden error, so Google wouldn't be able to access those pages. Google can't access them, which in this case is a good thing, right?
-
This is almost assuredly a link-based architectural error. It will be something similar to this:
- You load a page on EN
- You click the EN flag or language icon
- Instead of just reloading the page you are already on (since you're already on EN), the link is coded incorrectly and adds another /EN/ layer to the URL
- Once the new URL loads, the problem can be repeated
- This creates an infinite number of URLs on your site
- That is bad for both Google's crawler and Moz's crawler
I bet it's something like that. If you give me the exact URL, I might even be able to find the flaw and detail it for you via email or something.
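To illustrate the mechanism, here is a minimal sketch using Python's standard `urllib.parse.urljoin`, which resolves links the same way a browser or crawler does. The `en/index.html` href is only my guess at how the language link might be written (relative, with no leading slash); the real markup on the site may differ, and `follow_link` is a hypothetical helper.

```python
from urllib.parse import urljoin


def follow_link(start_url, href, clicks):
    """Resolve the same href repeatedly, as a crawler following it would."""
    urls = []
    current = start_url
    for _ in range(clicks):
        # Each new page resolves the href against its own URL.
        current = urljoin(current, href)
        urls.append(current)
    return urls


START = "http://www.mysite.com/en/index.html"

# Assumed broken link: relative href, so every click adds another /en/.
broken = follow_link(START, "en/index.html", 3)

# Root-relative href: always resolves to the same URL.
fixed = follow_link(START, "/en/index.html", 3)
```

Here `broken` grows by one `/en/` per click (`/en/en/index.html`, `/en/en/en/index.html`, and so on), while `fixed` stays on `http://www.mysite.com/en/index.html` every time. If that is what is happening, switching the language link to a root-relative or absolute URL should close the loop.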
-
Hi there,
Thanks so much for reaching out - Sam from Moz's Help Team here!
I'm just going to be reaching out to you directly from help@moz.com about this, after taking a look into your campaign and crawl. I'll be in touch soon!