Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Crawlers crawl weird long urls
-
I did a crawl start for the first time and i get many errors, but the weird fact is that the crawler tracks duplicate long, not existing urls.
For example (to be clear):
there is a page: www.website.com/dogs/dog.html
but then it is continuing crawling:
www.website.com/dogs/dog.html
www.website.com/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dogs/dog.htmlwhat can I do about this? Screaming Frog gave me the same issue, so I know it's something with my website
-
Answer from Screaming Frog!
The reason the SEO spider is crawling these URLs, is due to incorrect relative linking on the site from the login URL.
It's actually when the spider crawls the login page, http://www.website.com/login?returnurl=%2F which then leads to this URL http://www.website.com/Home/ctl/SendPassword?returnurl=http:/www.website.com/ and then this /home/ sub directory URL http://www.website.com/Home/ctl/page/dogs.aspx which links to http://www.website.com/Home/ctl/page/page/dogs.aspx and so on and so forth. This is the path to the incorrect relative linking (attached for you).To stop this, you can correct the incorrect relative linking, or easier, simply exclude the login page.
-
Wow, Big mistakes are made one Home
maybe because of the .aspx. extension? alle pages have seo-friendly urls
Thanks Wesley and Paddy Displays
-
I see a link to http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/HeutinkICT.aspx from http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx.
It's the bottom left block which causes this link. This way you will get a big nesting effect.
-
OK found one problem
on this page
http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx
you have a link to
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/LesscherIT.aspx
which i think should be
-
ok I did a quick screaming fog and I think I have an idea, you just have to follow the breadcrumbs
You said in you example "In Links 9", you need to find out what those pages are and follow it back to the point of origin As I think its just one bad link that cause this nested link effect.
eg
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx
is being linked from
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/StationtoStation.aspx (as well as others)
You just have to follow that trail till you find the source of the problem
-
every link, except the hompage itself
-
I can't see any source:
The pages are like:
| URL | www.website.com/page/ |
| Status Code | 200 |
| Status | OK |
| Type | text/html; charset=utf-8 |
| Size | 55811 |
| Title | |
| Level | 10 |
| In Links | 9 |
| Out Links | 38 | -
Which URL(s) is/are causing problems?
-
please be free to check: http://tinyurl.com/lox7le9
-
You don't necessarily have to remove the link. As long as you can verify that it directs to the right page.
But curious to see what caused the problem
-
I think Screaming Frog will tell you the page it found the weird url, then you can check the source, and find out whats producing that link.
-
That is a good one! It's true that I have the same linking to the page itself. I will remove all that kind of links first and crawl again. I'll keep you in touch!
-
Are you somehow linking to www.website.com/dogs/dog.html from the page itself? There could be something wrong with that link.
I made a small mistake not so long ago with a redirection plugin. I told it to go to domain.com. This plugin was looking at the base + what i told it to. So it went to: domain.com/domain.com. Perhaps you made a similar mistake.Maybe you can send me the URL and i can take a look at it?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved Ooops. Our crawlers are unable to access that URL
hello
Moz Pro | | ssblawton2533
i have enter my site faroush.com but i got an error
Ooops. Our crawlers are unable to access that URL - please check to make sure it is correct
what is problem ?0 -
URL Length Issue
MOZ is telling me the URLs are too long. I did a little research and I found out that the length of the URLs is not really a serious problem. In fact, others recommend ignoring the situation. Even on their blog I found this explanation: "Shorter URLs are generally preferable. You do not need to take this to the extreme, and if your URL is already less than 50-60 characters, do not worry about it at all. But if you have URLs pushing 100+ characters, there's probably an opportunity to rewrite them and gain value. This is not a direct problem with Google or Bing - the search engines can process long URLs without much trouble. The issue, instead, lies with usability and user experience. Shorter URLs are easier to parse, copy and paste, share on social media, and embed, and while these may all add up to a fractional improvement in sharing or amplification, every tweet, like, share, pin, email, and link matters (either directly or, often, indirectly)." And yet, I have these questions: In this case, why do I get this error telling me that the urls are too long, and what are the best practices to get this out? Thank You
Moz Pro | | Cart_generation1 -
Youtube traffic page url referral
Hello, How can I see which videos from Youtube that has my domain inserted in their description url drive traffic to my domain? I can see in GA how many visitors are coming from Youtube to my domain, but I can't see what Youtube video pages has driven traffic. Any help?
Moz Pro | | xeonet320 -
How to track data from old site and new site with the same URL?
We are launching a new site within the next 48 hours. We have already purchased the 30 day trial and we will continue to use this tool once the new site is launched. Just looking for some tips and/or best practices so we can compare the old data vs. the new data moving forward....thank you in advance for your response(s). PB3
Moz Pro | | Issuer_Direct0 -
FollowerWonk: How long have I been following someone?
Is there a way within followerwonk to find out how long you have been following someone for or indeed how long they have been following you for? I have downloaded the excel document from the "sort your followers" tab Which has sections called. "follows @name" and "@name follows" but these just give a last checked date(if they do follow/are being followed) rather than the metric I want!
Moz Pro | | BBI_Brandboost0 -
What user agent is used by SEOMOZ crawler?
We have a pretty tight robots.txt file in place to only allow the major search engines. I do not want to block SEOMOZ.ORG from being able to crawl the site so I want to make sure the user agent is open.
Moz Pro | | eseider0 -
How to force SeoMoz to re-crawl my website?
Hi, I have done a lot of changes on my website to comply with SeoMoz advices, now I would like to see if I have better feedback from the tool, how can I force it to re-crawl a specific campaign? (waiting another week is too long :-))
Moz Pro | | oumma0 -
Duplicate page titles are the same URL listed twice
The system says I have two duplicate page titles. The page titles are exactly the same because the two URLs are exactly the same. These same two identical URLs show up in the Duplicate Page Content also - because they are the same. We also have a blog and there are two tag pags showing identical content - I have blocked the blog in robots.txt now, because the blog is only for writers. I suppose I could have just blocked the tags pages.
Moz Pro | | loopyal0