Very wierd pages. 2900 403 errors in page crawl for a site that only has 140 pages.
-
Hi there,
I just made a crawl of the website of one of my clients with the crawl tool from moz.
I have 2900 403 errors and there is only 140 pages on the website.
I will give an exemple of what the crawl error gives me.
|
http://www.mysite.com/en/www.mysite.com/en/en/index.html#?lang=en
|
http://www.mysite.com/en/www.mysite.com/en/en/en/index.html#?lang=en
|
http://www.mysite.com/en/www.mysite.com/en/en/en/en/index.html#?lang=en
|
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/index.html#?lang=en
|
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/index.html#?lang=en
|
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/index.html#?lang=en
|
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en
|
http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en
|
|
|
|
|
|
|
|
|
|
There are 2900 pages like this.
I have tried visiting the pages and they work, but they are only html pages without CSS.
Can you guys help me to see what the problems is. We have experienced huge drops in traffic since Septembre.
-
Thank you so much for your response!
Yes. Could you please email me at eliotostiguy@gmail.com? I will be able to give you the url via email
-
Almost right, but 'just about' wrong; the 403 error is only served once an URL 'is' accessed. The content may not be accessible (as it's forbidden) but the URL itself, still is. Whilst it's unlikely that these URLs would ever be indexed, there's still an infinite loop in the link architecture which could impact upon crawl allowance and site health metrics
I'd get it sorted out!
-
but 403 is a forbidden error so those pages wouldn't be getting accessed from google. Google can't access them which in this case is a good thing right.
-
This is almost assuredly a link-based architectural error. It will be something similar to this:
- You load a page on EN
- You click the EN flag or language icon
- Instead of just reloading the page you are already on (since you're already on EN) the link is coded wrong and adds another /EN/ layer to the URL
- Once the new URL loads, the problem can be repeated
- This creates infinity URLs on your site
- Bad for Google, and Moz's crawler
Bet you it's something like that. If you give me the exact URL I might even be able to find the flaw and detail it for you via email or something
-
Hi there,
Thanks so much for reaching out - Sam from Moz's Help Team here!
I'm just going to be reaching out to you directly from help@moz.com about this, after taking a look into your campaign and crawl. I'll be in touch soon!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Page Speed or Site Speed which one does Google considered a ranking signal
I've read many threads online which proves that website speed is a ranking factor. There's a friend whose website scores 44 (slow metric score) on Google Pagespeed Insights. Despite that his website is slow, he outranks me on Google search results. It confuses me that I optimized my website for speed, but my competitor's slow site outperforms me. On Six9ja.com, I did amazing work by getting my target score which is 100 (fast metric score) on Google Pagespeed Insights. Coming to my Google search console tool, they have shown that some of my pages have average scores, while some have slow scores. Google search console tool proves me wrong that none of my pages are fast. Then where did the fast metrics went? Could it be because I added three Adsense Javascript code to all my blog posts? If so, that means that Adsense code is slowing website speed performance despite having an async tag. I tested my blog post speed and I understand that my page speed reduced by 48 due to the 3 Adsense javascript codes added to it. I got 62 (Average metric score). Now, my site speed is=100, then my page speed=62 Does this mean that Google considers page speed rather than site speed as a ranking factor? Screenshots: https://imgur.com/a/YSxSwOG Regarding: https://etcnaija.com
Technical SEO | | etcna0 -
Website crawl error
Hi all, When I try to crawl a website, I got next error message: "java.lang.IllegalArgumentException: Illegal cookie name" For the moment, I found next explanation: The errors indicate that one of the web servers within the same cookie domain as the server is setting a cookie for your domain with the name "path", as well as another cookie with the name "domain" Does anyone has experience with this problem, knows what it means and knows how to solve it? Thanks in advance! Jens
Technical SEO | | WeAreDigital_BE0 -
Tools/Software that can crawl all image URLs in a site
Excluding Screaming Frog, what other tools/software to use in order to crawl all image URLs in a site? Because in Screaming Frog, they don't crawl image URLs which are not under the site domain. Example of an image URL outside the client site: http://cdn.shopify.com/images/this-is-just-a-sample.png If the client is: http://www.example.com, Screaming Frog only crawls images under it like, http://www.example.com/images/this-is-just-a-sample.png
Technical SEO | | jayoliverwright0 -
I have a 404 error on my site i can't find.
I have looked everywhere. I thought it might have just showed up while making some changes, so while in webmaster tools i said it was fixed.....It's still there. Even moz pro found it. error is http://mydomain.com/mydomain.com No idea how it even happened. thought it might be a plugin problem. Any ideas how to fix this?
Technical SEO | | NateStewart0 -
Pageing page and seo meta tag questions
Hi if i am using paging in my website there is lots of product in my website now in paging total paging is 1000 pages now what title tag i need to add for every paging page or is there any good way we can tell search engine all page or same ?
Technical SEO | | constructionhelpline0 -
Page title vs page element
Hello! I'm new to SEO as my question would imply. Can someone show me the difference between a page title and a page element? Thank you!
Technical SEO | | atrenary1 -
Trying to reduce pages crawled to within 10K limit via robots.txt
Our site has far too many pages for our 10K page PRO account which are not SEO worthy. In fact, only about 2000 pages qualify for SEO value. Limitations of the store software only permit me to use robots.txt to sculpt the rogerbot site crawl. However, I am having trouble getting this to work. Our biggest problem is the 35K individual product pages and the related shopping cart links (at least another 35K); these aren't needed as they duplicate the SEO-worthy content in the product category pages. The signature of a product page is that it is contained within a folder ending in -p. So I made the following addition to robots.txt: User-agent: rogerbot
Technical SEO | | AspenFasteners
Disallow: /-p/ However, the latest crawl results show the 10K limit is still being exceeded. I went to Crawl Diagnostics and clicked on Export Latest Crawl to CSV. To my dismay I saw the report was overflowing with product page links: e.g. www.aspenfasteners.com/3-Star-tm-Bulbing-Type-Blind-Rivets-Anodized-p/rv006-316x039354-coan.htm The value for the column "Search Engine blocked by robots.txt" = FALSE; does this mean blocked for all search engines? Then it's correct. If it means "blocked for rogerbot? Then it shouldn't even be in the report, as the report seems to only contain 10K pages. Any thoughts or hints on trying to attain my goal would REALLY be appreciated, I've been trying for weeks now. Honestly - virtual beers for everyone! Carlo0