Crawlers crawl weird long URLs
-
I started a crawl for the first time and got many errors, but the weird part is that the crawler follows long, duplicated URLs that don't exist.
For example (to be clear):
there is a page: www.website.com/dogs/dog.html
but then the crawler continues with:
www.website.com/dogs/dog.html
www.website.com/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dogs/dog.html
What can I do about this? Screaming Frog gave me the same issue, so I know it's something with my website.
-
Answer from Screaming Frog!
The SEO Spider is crawling these URLs because of incorrect relative linking on the site, which it reaches from the login URL.
It happens when the spider crawls the login page, http://www.website.com/login?returnurl=%2F, which leads to http://www.website.com/Home/ctl/SendPassword?returnurl=http:/www.website.com/ and then to this /home/ subdirectory URL: http://www.website.com/Home/ctl/page/dogs.aspx. That page links to http://www.website.com/Home/ctl/page/page/dogs.aspx, and so on. This is the path to the incorrect relative linking (attached for you).
To stop this, you can correct the relative links or, more simply, exclude the login page from the crawl.
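To illustrate how a relative link produces this kind of growth, here is a minimal Python sketch using the standard `urllib.parse.urljoin`, which resolves links the way browsers and crawlers generally do. The page URL and the `href` values are assumptions modelled on the example in the question, not taken from the real site.

```python
from urllib.parse import urljoin

# Hypothetical page, modelled on the example in the question.
page = "http://www.website.com/dogs/dog.html"

# A relative href without a leading slash is resolved against the page's directory.
bad = urljoin(page, "dogs/dog.html")
# A root-relative href (leading slash) is resolved from the site root.
good = urljoin(page, "/dogs/dog.html")

print(bad)   # http://www.website.com/dogs/dogs/dog.html
print(good)  # http://www.website.com/dogs/dog.html

# If every generated page carries the same bad link, each hop adds one level.
url = page
for _ in range(3):
    url = urljoin(url, "dogs/dog.html")
    print(url)
```

If the server answers 200 for the nested URLs instead of 404, the crawler has no reason to stop, which would explain the ever longer paths.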
-
Wow, big mistakes were made on Home.
Maybe because of the .aspx extension? All pages have SEO-friendly URLs.
Thanks Wesley and Paddy Displays
-
I see a link to http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/HeutinkICT.aspx from http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx.
It's the bottom left block which causes this link. This way you will get a big nesting effect.
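One rough way to spot the nesting effect in a crawl export is to flag URLs where the same path segment appears twice in a row. The sketch below is only an illustration: `has_repeated_segment` is a hypothetical helper, and the `crawled` list simply reuses the two URLs mentioned in this post.

```python
from urllib.parse import urlsplit

def has_repeated_segment(url: str) -> bool:
    """Return True if two consecutive path segments are identical (case-insensitive)."""
    segments = [s.lower() for s in urlsplit(url).path.split("/") if s]
    return any(a == b for a, b in zip(segments, segments[1:]))

# Stand-in for the URL column of a crawl export.
crawled = [
    "http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx",
    "http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/HeutinkICT.aspx",
]

suspects = [u for u in crawled if has_repeated_segment(u)]
print(suspects)
```

A site can legitimately repeat a folder name, so treat the result as a list of candidates to check rather than a list of errors.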
-
OK found one problem
on this page
http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx
you have a link to
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/LesscherIT.aspx
which I think should be
-
OK, I ran a quick Screaming Frog crawl and I think I have an idea: you just have to follow the breadcrumbs.
In your example you showed "In Links 9". You need to find out which pages those are and follow them back to the point of origin, as I think it's just one bad link that causes this nested link effect.
e.g.
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx
is being linked from
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/StationtoStation.aspx (as well as others)
You just have to follow that trail till you find the source of the problem
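Following the trail by hand gets tedious, so here is a rough Python sketch of the idea. It assumes you have exported the inlinks from your crawler into a dictionary mapping each URL to the pages that link to it. The `inlinks` data is partly hypothetical: the first and last entries come from links mentioned in this thread, the middle one is invented to complete the chain. `nesting_depth` and `trace_to_origin` are illustrative helpers, not part of any tool.

```python
from urllib.parse import urlsplit

def nesting_depth(url: str) -> int:
    """Count how many path segments repeat the segment directly before them."""
    segs = [s for s in urlsplit(url).path.split("/") if s]
    return sum(1 for a, b in zip(segs, segs[1:]) if a == b)

def trace_to_origin(url: str, inlinks: dict) -> list:
    """Walk back through linking pages until one has no repeated segments."""
    trail = [url]
    seen = {url}
    while nesting_depth(trail[-1]) > 0:
        sources = [s for s in inlinks.get(trail[-1], []) if s not in seen]
        if not sources:
            break  # no unvisited linking page known; stop here
        nxt = min(sources, key=nesting_depth)  # prefer the least nested source
        trail.append(nxt)
        seen.add(nxt)
    return trail

base = "http://www.odin-groep.nl/Home/ctl"
inlinks = {
    # From this thread.
    base + "/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx":
        [base + "/OverOdin/OverOdin/OverOdin/StationtoStation.aspx"],
    # Hypothetical link, added only to complete the chain.
    base + "/OverOdin/OverOdin/OverOdin/StationtoStation.aspx":
        [base + "/OverOdin/OverOdin/LesscherIT.aspx"],
    # From this thread.
    base + "/OverOdin/OverOdin/LesscherIT.aspx":
        [base + "/OverOdin/ReindersICT.aspx"],
}

trail = trace_to_origin(
    base + "/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx", inlinks
)
for step in trail:
    print(step)
```

The last URL printed is the first page on the trail with a clean path, which is where I would look for the bad relative link in the HTML source.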
-
Every link, except the homepage itself.
-
I can't see any source.
The pages look like this:
| URL | www.website.com/page/ |
| Status Code | 200 |
| Status | OK |
| Type | text/html; charset=utf-8 |
| Size | 55811 |
| Title | |
| Level | 10 |
| In Links | 9 |
| Out Links | 38 |
-
Which URL(s) is/are causing problems?
-
Please feel free to check: http://tinyurl.com/lox7le9
-
You don't necessarily have to remove the link, as long as you can verify that it points to the right page.
But I'm curious to see what caused the problem.
-
I think Screaming Frog will tell you the page on which it found the weird URL. Then you can check that page's source and find out what's producing the link.
-
That is a good one! It's true that I have that same kind of link pointing to the page itself. I will remove all links of that kind first and crawl again. I'll keep you posted!
-
Are you somehow linking to www.website.com/dogs/dog.html from the page itself? There could be something wrong with that link.
I made a small mistake not so long ago with a redirection plugin. I told it to go to domain.com. The plugin took the base URL and appended what I gave it, so it went to domain.com/domain.com. Perhaps you made a similar mistake.
Maybe you can send me the URL and I can take a look at it?
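I don't know exactly how that plugin resolved its target, but the standard URL resolution rules give the same result, as this small Python sketch with `urllib.parse.urljoin` shows. The `base` value is an assumption for illustration.

```python
from urllib.parse import urljoin

base = "http://domain.com/"

# Without a scheme, "domain.com" is treated as a relative path, not a host.
as_typed = urljoin(base, "domain.com")
# With a scheme (or a leading "//"), it is treated as a host.
with_scheme = urljoin(base, "http://domain.com")
protocol_relative = urljoin(base, "//domain.com")

print(as_typed)           # http://domain.com/domain.com
print(with_scheme)        # http://domain.com
print(protocol_relative)  # http://domain.com
```

So a redirect target or link that is meant to be absolute should include the scheme, otherwise it gets appended to the current path.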