Will blocking the Wayback Machine (archive.org) have any impact on Google crawl and indexing/SEO?
-
Will blocking the Wayback Machine (archive.org) by adding the code they give have any impact on Google crawl and indexing/SEO?
Anyone know?
Thanks!
~Brett
-
I have blocked the Wayback Machine for a client so that it could not archive the site. I blocked it via robots.txt rather than a meta noindex tag, and blocking the Wayback Machine did NOT affect the site's positions in the targeted Google results.
Hope this helps.
-
Brett,
I am not sure what code you are referring to, but what archive.org suggests is blocking their crawler through robots.txt:
User-agent: ia_archiver
Disallow: /
The robots.txt file should be in your root directory.
It's explained here: http://archive.org/about/exclude.php
Doing this will not impact your search results or crawl on Google.
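If you want to sanity-check the rule before uploading it, here is a minimal sketch using Python's standard-library robots.txt parser. It only shows how a standards-following parser reads the two lines above; it does not prove how any real crawler behaves. The URL `http://www.example.com/page` and the helper name `is_allowed` are placeholders I made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted above, exactly as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration; it parses the text locally
    and makes no network requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL, not a real site from this thread.
page = "http://www.example.com/page"

archiver_allowed = is_allowed(ROBOTS_TXT, "ia_archiver", page)
googlebot_allowed = is_allowed(ROBOTS_TXT, "Googlebot", page)

print("ia_archiver allowed:", archiver_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

Because the rule names only `ia_archiver` and there is no `User-agent: *` group, the parser treats every other user agent, Googlebot included, as unrestricted.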
V-