Having issues crawling a website
-
We tried to use the Screaming Frog tool to crawl this website and get a list of all meta titles from the site. However, it returned only one result: the homepage.
We then tried to obtain a list of the site's URLs by creating a sitemap using https://www.xml-sitemaps.com/. Once again, however, we got just the one result: the homepage.
Something seems to be preventing these tools from crawling all the pages. If anyone can shed some light on what this could be, we'd be most appreciative.
-
That robots.txt should be fine; it's not blocking anything.
The reason the crawl is stopping on the homepage is this code:
<meta name="robots" content="nofollow">
That tag tells bots not to follow any links on the page. Remove it and you should be good.
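If you want to confirm this on other pages before re-running the crawl, here is a minimal sketch using only the Python standard library. The function names are made up for illustration, and it assumes you already have the page's HTML as a string (for example, copied from "view source"). It only looks at the robots meta tag, not at anything else that might stop a crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without a value.
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_meta_directives(html):
    """Return the set of robots meta directives present in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def blocks_link_following(html):
    """True if the page asks crawlers not to follow its links."""
    directives = robots_meta_directives(html)
    # "none" is shorthand for "noindex, nofollow".
    return "nofollow" in directives or "none" in directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="nofollow"></head></html>'
    print(robots_meta_directives(page))
    print(blocks_link_following(page))
```

Running it against the homepage source should report `nofollow` while the tag is in place, and nothing once it has been removed.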
-
Hi,
I think it is your robots.txt file that is causing the issue. At the moment you have the following:
User-agent: *
Disallow:
I would recommend updating it to the following:
User-agent: *
Allow: /
Moz also has a good post about what else you can include in your robots.txt file, including best practices:
https://moz.com/learn/seo/robotstxt
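For anyone who wants to test the two versions side by side, here is a rough sketch using Python's standard-library `urllib.robotparser`. The helper name `can_crawl` and the example.com URL are placeholders, not anything from the site in question. One caveat: under this parser an empty `Disallow:` is treated as "allow everything", so both versions may well give the same answer, and individual crawlers can interpret rules slightly differently.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, user_agent="*"):
    """Check whether user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The file as it is at the moment (empty Disallow).
CURRENT_RULES = "User-agent: *\nDisallow:"
# The suggested replacement.
SUGGESTED_RULES = "User-agent: *\nAllow: /"
# For contrast: a file that really does block the whole site.
BLOCK_ALL_RULES = "User-agent: *\nDisallow: /"

if __name__ == "__main__":
    url = "https://www.example.com/some-page"  # placeholder URL
    for label, rules in [
        ("current", CURRENT_RULES),
        ("suggested", SUGGESTED_RULES),
        ("block all", BLOCK_ALL_RULES),
    ]:
        print(label, can_crawl(rules, url))
```

If the current and suggested files both come back as crawlable, the robots.txt change is harmless but is unlikely to be what stops the crawl at the homepage.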
Hope that helps
Thanks