Will blocking the Wayback Machine (archive.org) have any impact on Google crawl and indexing/SEO?
-
Will blocking the Wayback Machine (archive.org) by adding the code they give have any impact on Google crawl and indexing/SEO?
Anyone know?
Thanks!
~Brett
-
I have blocked the Wayback Machine for a client and not allowed them to index the site. I blocked them via the robots.txt and not Meta NoIndex, and while blocking Wayback Machine it did NOT impact the positions within the targeted Google results.
Hope this helps.
-
Brett,
I am not sure what code you are referring to but what archive.org suggests is blocking their crawler through robots.txt:
User-agent: ia_archiver
Disallow: /The robots.txt file should be in your root directory.
It's explained here: http://archive.org/about/exclude.php
Doing this will not impact your search results or crawl on Google.
V-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Weird Google indexing issues with www being forced
IM working on a site which is really not indexing as it should, I have created a sitemap.xml which I thought would fix the issue but it hasn't, what seems to be happening is the Google is making www pages canonical for some of the site and without www for the rest. the site should be without www. see images attached for a visual explanation.
Technical SEO | | Donsimong
when adding pages in Google search console without www some pages cannot be indexed as Google thinks the www version is canonical, and I have no idea why, there is no canonical set up at all, what I would do if I could is to add canonical tags to each page to pint to the non www version, but the CMA does not allow for canonical. not quite sure how to proceed, how to tell google that the non www version is in fact correct, I dont have any idea why its assuming www is canonical either??? k11cGAv zOuwMxv0 -
Why Google crawl parameter URLs?
Hi SEO Masters, Google is indexing this parameter URLs - 1- xyz.com/f1/f2/page?jewelry_styles=6165-4188-4184-4192-4180-6109-4191-6110&mode=li_23&p=2&filterable_stone_shapes=4114 2- xyz.com/f1/f2/page?jewelry_styles=6165-4188-4184-4192-4180-4169-4195&mode=li_23&p=2&filterable_stone_shapes=4115&filterable_metal_types=4163 I have handled by Google parameter like this - jewelry_styles= Narrows Let Googlebot decide mode= None Representative URL p= Paginates Let Googlebot decide filterable_stone_shapes= Narrows Let Googlebot decide filterable_metal_types= Narrows Let Googlebot decide and Canonical for both pages - xyz.com/f1/f2/page?p=2 So can you suggest me why Google indexed all related pages with this - xyz.com/f1/f2/page?p=2 But I have no issue with first page - xyz.com/f1/f2/page (with any parameter). Cononical of first page is working perfectly. Thanks
Technical SEO | | Rajesh.Prajapati
Rajesh0 -
New SEO manager needs help! Currently only about 15% of our live sitemap (~4 million url e-commerce site) is actually indexed in Google. What are best practices sitemaps for big sites with a lot of changing content?
In Google Search console 4,218,017 URLs submitted 402,035 URLs indexed what is the best way to troubleshoot? What is best guidance for sitemap indexation of large sites with a lot of changing content? view?usp=sharing
Technical SEO | | Hamish_TM1 -
What will the SEO and conversion implications of an ecommere website not moving to https be?
Client has an ecommerce website on Adobe Business Catalyst and they are currently not able to move websites onto https. They have announced this won't be available until late June. What are the expected implications of this in terms of site visibility and conversions etc?
Technical SEO | | Kerry_Jones0 -
Redirect of https:// to http:// without SSL. Possible or not?!
Good afternoon, smart dudes : ) I am here to ask for your help. I posted this question on google help forum and stackoverflow, but looks like people do not know the correct answer... QUESTION: We used to have a secured site, but recently purchased a separate reservation software that provides SSL (takes clients to a separate secured website) where they can fill out the reservation form. We cancelled our SSL (just think its a waste to pay $100 for securing plain text). Now i have so many links pointing to our secured site and i have no idea how to fix it! How do i redirect https://www.mysite.comto http://www.mysite.com.Also would like to mention that i already have redirect from non www to www domain (not sure if that matters): RewriteEngine onRewriteCond %{HTTP_HOST} ^mysite.com$ [NC]RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]As i already mentioned....we do not have SSL!!!! None of those 301 redirect codes i found online work (you have to have SSL for the site to be redirected from https to http | currently i get an error - can't establish a secured connection to the server ). Is there anything i can do???? Or do i have to purchase SSL again?
Technical SEO | | JennaD140 -
Image Indexing Issue by Google
Hello All,My URL is: www.thesalebox.comI have Submitted my image Sitemap in google webmaster tool on 10th Oct 2013,Still google could not indexing any of my web images,Please refer my sitemap - www.thesalebox.com/AppliancesHomeEntertainment.xml and www.thesalebox.com/Hardware.xmland my webmaster status and image indexing status are below, Can you please help me, why my images are not indexing in google yet? is there any issue? please give me suggestions?Thanks!
Technical SEO | | CommercePundit0 -
Robots.txt to disallow /index.php/ path
Hi SEOmoz, I have a problem with my Joomla site (yeah - me too!). I get a large amount of /index.php/ urls despite using a program to handle these issues. The URLs cause indexation errors with google (404). Now, I fixed this issue once before, but the problem persist. So I thought, instead of wasting more time, couldnt I just disallow all paths containing /index.php/ ?. I don't use that extension, but would it cause me any problems from an SEO perspective? How do I disallow all index.php's? Is it a simple: Disallow: /index.php/
Technical SEO | | Mikkehl0 -
I cannot find a way to implement to the 2 Link method as shown in this post: http://searchengineland.com/the-definitive-guide-to-google-authorship-markup-123218
Did Google stop offering the 2 link method of verification for Authorship? See this post below: http://searchengineland.com/the-definitive-guide-to-google-authorship-markup-123218 And see this: http://www.seomoz.org/blog/using-passive-link-building-to-build-links-with-no-budget In both articles the authors talk about how to set up Authorship snippets for posts on blogs where they have no bio page and no email verification just by linking directly from the content to their Google+ profile and then by linking the from the the Google+ profile page (in the Contributor to section) to the blog home page. But this does not work no matter how many ways I trie it. Did Google stop offering this method?
Technical SEO | | jeff.interactive0