Cannot crawl website with redirect installed on subdomain URL
-
Hi!
I want to crawl this website: http://www.car-moderne.ch.
I tried, and the crawl came back with just that one URL (not all the pages of the website). This single-line CSV says that the status of http://www.car-moderne.ch is 200, but in fact it is a 301 redirect to http://www.car-moderne.ch/fr, where the live home page is (the MozBar actually sees the 301, not the 200 that the single-line crawl reports).
How can I proceed in this case (with a 301 redirect installed on the subdomain URL) so that I can still get a full-fledged, juicy CSV with all the broken links, duplicate content, and so on?
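(For anyone reading along who wants to reproduce this: a quick way to see which status code the server actually returns, without letting the client follow the redirect, is a check along these lines. This is a minimal sketch using Python's requests library; the user-agent string is just an illustrative placeholder, not what Moz's crawler sends.)

```python
import requests

# Request the bare home page but do NOT follow redirects, so we see the
# first status code the server sends rather than the final page it lands on.
response = requests.get(
    "http://www.car-moderne.ch",
    allow_redirects=False,
    headers={"User-Agent": "Mozilla/5.0 (compatible; redirect-check)"},
    timeout=10,
)

print("Status:", response.status_code)                 # 301 would confirm the redirect
print("Location:", response.headers.get("Location"))   # e.g. the /fr home page
```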
Thank you for your help!
Pascal Hämmerli
-
So glad to help, Pascal!
-
Dear Chiaryn,
Thank you for your very helpful reply.
This website is hosted by a partner agency that created the site; I only act as an SEO consultant for them. What you say is very helpful because it means their home-made CMS should be corrected to provide proper 301 redirects.
I wish you a good day,
Pascal
-
Hey Pascal,
Sorry for the confusion here! It looks like the subdomain, www.car-moderne.ch, returns a 200 HTTP status to our crawler and to other crawlers, such as the hurl.it tool. In the body of the screenshot I attached from the hurl.it tool, the only code there is the number 404, so basically the site is serving a page with no crawlable data. The page isn't redirecting and it doesn't return any real source code, so there is no data for us to include in the crawl. I would recommend working with your webmaster to resolve this issue and to get the page to correctly serve a 301 redirect to the /fr version of the site to all crawlers.
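(Purely as an illustration of that last point, and not the agency's actual CMS code: a correct fix would return the 301 unconditionally, regardless of which user-agent asks. A minimal sketch in Python/Flask, assuming the home route could be adjusted, might look like this.)

```python
from flask import Flask, redirect

app = Flask(__name__)

@app.route("/")
def home():
    # Every client (browsers, rogerbot, googlebot, hurl.it) gets the same
    # permanent (301) redirect to the French home page.
    return redirect("http://www.car-moderne.ch/fr", code=301)

if __name__ == "__main__":
    app.run()
```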
I can see that the site is correctly responding with a 301 redirect for some crawlers, such as this test I ran as googlebot, but the response doesn't seem to be consistent. One thing you will want to be sure to have your webmaster check is how the site responds to user-agents that are hosted on Amazon Web Services, as some of our crawlers and the hurl.it crawl are both hosted through AWS.
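(If it helps to verify the inconsistency, a rough comparison of how the site answers different user-agents could look like the sketch below. The user-agent strings here are only illustrative examples, not the exact strings Moz or Google send.)

```python
import requests

# Illustrative user-agent strings; the real crawler strings differ slightly.
USER_AGENTS = {
    "plain browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "rogerbot-like": "rogerbot/1.0 (+https://moz.com)",
    "googlebot-like": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for name, ua in USER_AGENTS.items():
    r = requests.get(
        "http://www.car-moderne.ch",
        headers={"User-Agent": ua},
        allow_redirects=False,
        timeout=10,
    )
    # A consistent setup would show the same status (ideally 301) for every agent.
    print(f"{name:15} -> {r.status_code} {r.headers.get('Location', '')}")
```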
Once the issue of the HTTP response is resolved, you should be able to get much better data from the crawl test tool.
I hope this helps! Please let me know if I can help you with anything else.
Chiaryn
Related Questions
-
Moz bot has trouble crawling Angular JS - I believe it's seeing the SPA (Single Page Application) before Universal. Anyone else have this issue? What is the fix?
The Moz bot user agent detection settings are able to read Universal, but the Single Page Application (SPA) version partially loads on the website before Universal. Because of this, Moz (and possibly search engines) think we have massive duplicate content issues. For example, the crawl report said a particular product page (which has about 1,000 words) has 33,000 words and has duplicate content with over 300 other pages. This makes me believe it's only picking up the SPA version. Has anyone come across this, and what would be the fix?
Moz Bar | | laurengdicenso1 -
New Domain Authority 2.0 has affected my website rankings badly
My website https://www.successvalley.tech/ had a domain authority of 39, but after the new update it's now 15. I am not happy at all because it took me three years to get there, and now it has been reduced to nothing. I would kindly appreciate an explanation of what happened. Thanks
Moz Bar | | Amenorhu1 -
Limit MOZ crawl rate on Shopify or when you don't have access to robots.txt
Hello. I'm wondering if there is a way to control the crawl rate of Moz on our site. It is hosted on Shopify, which does not allow any kind of control over the robots.txt file to add a rule like this:
User-agent: rogerbot
Crawl-delay: 5
Because of this, we get a lot of 430 error codes (mainly on our products), and this certainly prevents Moz from getting the full picture of our shop. Can we rely on Moz's data when critical pages are not being crawled due to 430 errors? Is there any alternative to fix this? Thanks
Moz Bar | | AllAboutShapewear2 -
Crawl report shows that it gets 4xx errors for pages that work fine. Why?
On the crawl report it has all these "Critical Crawler Issues". They all say "4xx Error", yet when I click on the link from the crawler report, it goes to a perfectly functioning page, not a 404 page or anything. If I click in, it actually says it's a 403 error. It's all for pages generated by the IDX solution for our real estate website. Is Moz broken, or am I missing something? Here are a couple of examples: https://teamvivi.com/homes-for-sale-map-search/ and https://teamvivi.com/email-alerts/
Moz Bar | | TeamViviRealEstate0 -
Is there a way to export all your crawl errors for multiple Moz campaigns at once?
We're looking for a simple way to export all crawl errors for our Moz campaigns. More than likely we could use the API, but we were wondering if there is any functionality already built into Moz for exporting all crawl errors.
Moz Bar | | ReunionMarketing0 -
Moz Site Crawl Test 404
I crawled the site a number of times using Crawl Test. It's reporting 404s for files that are actually present. What do you make of this? Justin
Moz Bar | | GrouchyKids0 -
Site crawl errors - download a list of all URLs
Hi, I've provided my client's developers with the PDF reports of crawl errors, but these seem to miss some URLs. I see there are lots of CSV file download/email options. Will the email CSV button send a report of everything, listing all the URLs that are missing from the PDFs? If not, will the more specific CSV reports? It would be good if I could press one button and get all issues listed with all URLs. It does look like this happens, but I just want the best way confirmed ASAP since I need to provide the reports urgently. Any guidance much appreciated! All the best, Dan
Moz Bar | | Dan-Lawrence0 -
Rel Canonical and Moz Crawl
We have rel=canonical tags set up on a few pages. When viewing the page source, the tags are correct. However, Moz Crawl results show the opposite. For example, the page source correctly shows URL X with a rel=canonical tag of URL Y, but the Moz crawl is showing URL Y with a rel=canonical tag of URL X. Any thoughts on why this would happen? Which should I trust more?
Moz Bar | | S.S.N0