How Did Google's Crawler Find and Cache Orphan Pages and a Directory?
-
I have a website, www.test.com.
I made some changes to the live website and uploaded them to a recently created "demo" directory for client approval.
So my demo link is www.test.com/demo/
I am not doing any link building or any other activity that would pass a referral link to www.test.com/demo/
So how did Google's crawler find it and cache some pages, or even the entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). They didn't exclude the test site in robots.txt because "we all know we won't link to it so it'll be ok". The site got indexed anyway. It happened because a person at the company was having problems with the implementation of the test site, went to the CMS help forum (which that person didn't think would be indexed) and posted the URL of the test site.
I found the above by just putting in the URL of the test site into Google, and I saw the post in the help desk. You might try the same to see if somehow there is a rogue link.
-
Is Google crawling our emails?
Is it possible?
-
Yup, correct.
I was certain I'd replied to this.
Anyway, you ever notice how the ads in Gmail are always relevant to the content of your emails? Google are totally reading them.
-
The "conspiracy hat" side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could possibly have pulled your link to the demo directory from that.
-
Hi Barry,
Yes, we used Gmail for reporting.
Does that make any sense?
-
<conspiracy-hat>
Did either you or your client use Gmail when you sent him the demo link?
</conspiracy-hat>
Regardless, Dan's advice to noindex the directory and block it from spiders is the way to go in future when doing development work.
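For anyone wanting to check whether a demo page actually carries a noindex directive, here is a minimal sketch using only the Python standard library. The sample HTML strings and the names `RobotsMetaParser` and `has_noindex` are made up for illustration; they are not taken from the poster's site.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise to "".
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.append(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tags include 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical pages: a demo page with the tag and a live page without it.
DEMO_PAGE = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "<title>Demo</title></head><body></body></html>"
)
LIVE_PAGE = "<html><head><title>Home</title></head><body></body></html>"

demo_has_noindex = has_noindex(DEMO_PAGE)
live_has_noindex = has_noindex(LIVE_PAGE)
print(demo_has_noindex, live_has_noindex)
```

Bear in mind that a crawler has to be able to fetch the page in order to see this tag at all.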
-
Hi JoelHit,
No, there is not a single referral link to the "demo" directory from anywhere on the website or from third-party websites.
I am aware of how Google's crawling and indexing systems work.
Thanks.
-
Hi Thetjo,
I know about it.
My question is: how did Google crawl it without any referral link?
Thanks.
-
Hi Dan,
No, I have not excluded the "demo" directory in robots.txt for any search engine.
I am not using WordPress; it's a simple static HTML website (no CMS of any kind).
-
Did this actually happen or are we talking about a hypothetical situation here? It could be that there is a link to the demo directory you've overlooked? Has the /demo folder perhaps been used in the past and there were still old links to it?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a .htpasswd login to the area used for client approval.
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try and ensure it doesn't get crawled. Also, are you using WordPress? If so, WordPress automatically pings search engines when you add a post, and if you use the common sitemap plugin, it submits the sitemap to Google automatically when it creates it, so that's another way Google could have found it.
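To sanity-check that a robots.txt rule really covers the demo directory, something like the following sketch may help. It uses Python's standard `urllib.robotparser`; the robots.txt content shown is an assumed example, not the poster's actual file, and `is_crawl_allowed` is just an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for www.test.com (hypothetical, for illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /demo/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


demo_allowed = is_crawl_allowed(ROBOTS_TXT, "http://www.test.com/demo/index.html")
home_allowed = is_crawl_allowed(ROBOTS_TXT, "http://www.test.com/index.html")
print(demo_allowed, home_allowed)
```

This only tests the rules as written; it says nothing about whether a given crawler has already fetched the pages before the rule was added.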