Crawlers crawl weird long URLs
-
I started a crawl for the first time and I get many errors, but the weird part is that the crawler picks up duplicate, long, non-existent URLs.
For example (to be clear):
there is a page: www.website.com/dogs/dog.html
but then the crawler keeps going:
www.website.com/dogs/dog.html
www.website.com/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dogs/dog.html
What can I do about this? Screaming Frog gave me the same issue, so I know it's something with my website.
-
Answer from Screaming Frog!
The reason the SEO Spider is crawling these URLs is incorrect relative linking on the site, starting from the login URL.
It actually starts when the spider crawls the login page, http://www.website.com/login?returnurl=%2F, which leads to http://www.website.com/Home/ctl/SendPassword?returnurl=http:/www.website.com/ and then to this /Home/ subdirectory URL, http://www.website.com/Home/ctl/page/dogs.aspx, which links to http://www.website.com/Home/ctl/page/page/dogs.aspx, and so on. This is the path to the incorrect relative linking (attached for you).
To stop this, you can correct the incorrect relative linking or, easier, simply exclude the login page.
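As a rough illustration of why a relative link nests like this, here is a minimal Python sketch using the standard library's `urljoin`. The href value `page/dogs.aspx` and the helper `follow_link` are assumptions made up for the example (the thread does not show the actual markup), and the sketch assumes the server answers every nested URL with the same page.

```python
from urllib.parse import urljoin

def follow_link(start_url, href, hops):
    """Resolve the same href again and again, as a crawler does when
    every page it lands on contains that same link."""
    urls = [start_url]
    for _ in range(hops):
        urls.append(urljoin(urls[-1], href))
    return urls

start = "http://www.website.com/Home/ctl/page/dogs.aspx"

# Hypothetical relative href (no leading slash): resolved against the
# current directory, so each hop adds another "page/" segment.
nested = follow_link(start, "page/dogs.aspx", 3)

# Root-relative href (leading slash): always resolves to the same URL.
stable = follow_link(start, "/Home/ctl/page/dogs.aspx", 3)

for url in nested:
    print(url)
```

With the root-relative form, every hop lands on the same URL, which is why correcting the link (or excluding the page that starts the chain) stops the nesting.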
-
Wow, big mistakes were made on Home.
Maybe because of the .aspx extension? All pages have SEO-friendly URLs.
Thanks Wesley and Paddy Displays
-
I see a link to http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/HeutinkICT.aspx from http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx.
It's the bottom left block which causes this link. This way you will get a big nesting effect.
-
OK found one problem
on this page
http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx
you have a link to
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/LesscherIT.aspx
which I think should be
-
OK, I did a quick Screaming Frog crawl and I think I have an idea: you just have to follow the breadcrumbs.
You said in your example "In Links 9". You need to find out what those pages are and follow them back to the point of origin, as I think it's just one bad link that causes this nested link effect.
e.g.
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx
is being linked from
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/StationtoStation.aspx (as well as others)
You just have to follow that trail until you find the source of the problem.
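Following the trail can be sketched in Python roughly as below. This is only an illustration: the `inlinks` mapping stands in for a crawler's "In Links" export, the function names are made up, and the middle entry of the sample data is a hypothetical link added to connect the two pairs quoted in this thread.

```python
from urllib.parse import urlsplit

def has_repeated_segment(url):
    """True if two consecutive path segments are identical, e.g. /OverOdin/OverOdin/."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return any(a == b for a, b in zip(segments, segments[1:]))

def trace_to_source(url, inlinks):
    """Walk back through referring pages until reaching one with a clean URL."""
    trail = [url]
    seen = {url}
    while has_repeated_segment(trail[-1]):
        referrers = [r for r in inlinks.get(trail[-1], []) if r not in seen]
        if not referrers:
            break  # no further in-link data; the trail ends here
        nxt = min(referrers, key=len)  # the shortest referrer is closest to the origin
        trail.append(nxt)
        seen.add(nxt)
    return trail

base = "http://www.odin-groep.nl/Home/ctl/"
inlinks = {
    base + "OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx":
        [base + "OverOdin/OverOdin/OverOdin/StationtoStation.aspx"],
    # Hypothetical entry, added only to connect the trail for this example.
    base + "OverOdin/OverOdin/OverOdin/StationtoStation.aspx":
        [base + "OverOdin/OverOdin/HeutinkICT.aspx"],
    base + "OverOdin/OverOdin/HeutinkICT.aspx":
        [base + "OverOdin/ReindersICT.aspx"],
}

trail = trace_to_source(
    base + "OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx", inlinks
)
for step in trail:
    print(step)
```

The last URL in the trail is the first page with a normal-looking path, so that is the page whose source is worth checking for the bad link.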
-
Every link, except the homepage itself.
-
I can't see any source.
The pages look like this:
| URL | www.website.com/page/ |
| Status Code | 200 |
| Status | OK |
| Type | text/html; charset=utf-8 |
| Size | 55811 |
| Title | |
| Level | 10 |
| In Links | 9 |
| Out Links | 38 |
-
Which URL(s) is/are causing problems?
-
Please feel free to check: http://tinyurl.com/lox7le9
-
You don't necessarily have to remove the link, as long as you can verify that it points to the right page.
But I'm curious to see what caused the problem.
-
I think Screaming Frog will tell you the page on which it found the weird URL. Then you can check the source and find out what's producing that link.
-
That is a good one! It's true that I have links like that pointing to the page itself. I will remove all links of that kind first and crawl again. I'll keep you posted!
-
Are you somehow linking to www.website.com/dogs/dog.html from the page itself? There could be something wrong with that link.
I made a small mistake not so long ago with a redirection plugin. I told it to go to domain.com. The plugin took the base and appended what I told it, so it went to domain.com/domain.com. Perhaps you made a similar mistake.
Maybe you can send me the URL and I can take a look at it?
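The redirect mistake can be reproduced with a short Python sketch. It assumes the plugin resolves the target against the site base the way standard URL joining does; the plugin's real logic is not shown in this thread, and `resolve_redirect` is a made-up name.

```python
from urllib.parse import urljoin

def resolve_redirect(base_url, target):
    """Resolve a redirect target against the site base (assumed plugin behaviour)."""
    return urljoin(base_url, target)

# Without a scheme, "domain.com" is treated as a relative path, not a host.
broken = resolve_redirect("http://domain.com/", "domain.com")

# With the scheme included, the target is absolute and is used as-is.
fixed = resolve_redirect("http://domain.com/", "http://domain.com/")

print(broken)  # http://domain.com/domain.com
print(fixed)   # http://domain.com/
```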