How Google Carwler Cached Orphan pages and directory?
-
I have website www.test.com
I have made some changes in live website and upload it to "demo" directory (which is recently created) for client approval.
Now, my demo link will be www.test.com/demo/
I am not doing any type of link building or any activity which pass referral link to www.test.com/demo/
Then how Google crawler find it and cached some pages or entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). Didn't exclude the test site in robots because "we all know we won't link to it so it'll be ok". Site got indexed, and it was because a person at the company was having problems with the implementation of the test site, went to the help forum (which person didn't think would be indexed) and posted the URL to the test site.
I found the above by just putting in the URL of the test site into Google, and I saw the post in the help desk. You might try the same to see if somehow there is a rogue link.
-
Is google crawling our mails?
Is it possible?
-
Yup, correct.
I was certain I'd replied to this
Anyway, you ever notice how the ads in gmail are always relevant to the content of your emails? Google are totally reading them
-
The <conspiracy hat="">side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could have possibly pulled your link to the demo directory from that.</conspiracy>
-
Hi Barry,
Yes, We were used Gmail for reporting.
Is it make any sense??
-
<conspiracy-hat></conspiracy-hat>
Did either you or your client use gmail when you sent him the demo link?
Regardless, Dan's advice to noindex and block the directory from spiders is the future when doing development work.
-
Hi JoelHit,
NO, There is not any single refferal link to "Demo" directory from entire website and also from third party websites.
I am aware about Google Crawling and Indexing Systems.
Thanks.
-
Hi Thetjo,
I know about it.
My question is that how Google Crawl it without any referral link?
Thanks.
-
Hi Dan,
No, i am not exclude "demo" directory from robots.txt for any search engine.
I am not using wordpress its simple stattic HTML website (Not using any type of CMS).
-
Did this actually happen or are we talking about a hypothetical situation here? It could be that there is a link to the demo directory you've overlooked? Has the /demo folder perhaps been used in the past and there were still old links to it?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a .htpasswd login to the area used for client approval.
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try and ensure they don't get crawled. Also, are you using wordpress? If so, wordpress automatically pings search engines when you add a post and if you use the common sitemap plugin, when it creates the sitemap it submits it automatically to Google, so that's another way Google could have found it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is it ok to repeat a (focus) keyword used on a previous page, on a new page?
I am cataloguing the pages on our website in terms of which focus keyword has been used with the page. I've noticed that some pages repeated the same keyword / term. I've heard that it's not really good practice, as it's like telling google conflicting information, as the pages with the same keywords will be competing against each other. Is this correct information? If so, is the alternative to use various long-winded keywords instead? If not, meaning it's ok to repeat the keyword on different pages, is there a maximum recommended number of times that we want to repeat the word? Still new-ish to SEO, so any help is much appreciated! V.
Intermediate & Advanced SEO | | Vitzz1 -
Irrelevant Landing Pages are Ranking on Google SERP
Hi, I have noticed that Google likes to rank random pages on my site higher in the SERPs than the actual relevant content page for that service. Please let me know why it is happening?
Intermediate & Advanced SEO | | RuchiPardal0 -
Is it OK to Delete a Page and Move Content to a Another Page without 301 re-direct
I have a page "A" that I want to completely delete and move the written content from A" to page "B". Since I am deleting "A" (not keeping page) is it OK to upload the content from "A" to page "B" and search engines will give "B" credit for the unique content? Or, since the content has already once been indexed on "A", "B" may struggle to get full credit for this new unique content, even though page "A" is deleted?
Intermediate & Advanced SEO | | khi50 -
What's the best way to check Google search results for all pages NOT linking to a domain?
I need to do a bit of link reclamation for some brand terms. From the little bit of searching I've done, there appear to be several thousand pages that meet the criteria, but I can already tell it's going to be impossible or extremely inefficient to save them all manually. Ideally, I need an exported list of all the pages mentioning brand terms not linking to my domain, and then I'll import them into BuzzStream for a link campaign. Anybody have any ideas about how to do that? Thanks! Jon
Intermediate & Advanced SEO | | JonMorrow0 -
Google Analytics: how to filter out pages with low bounce rate?
Hello here, I am trying to find out how I can filter out pages in Google Analytics according to their bounce rate. The way I am doing now is the following: 1. I am working inside the Content > Site Content > Landing Pages report 2. Once there, I click the "advanced" link on the right of the filter field. 3. Once there, I define to "include" "Bounce Rate" "Greater than" "0.50" which should show me which pages have a bounce rate higher of 0.50%.... instead I get the following warning on the graph: "Search constraints on metrics can not be applied to this graph" I am afraid I am using the wrong approach... any ideas are very welcome! Thank you in advance.
Intermediate & Advanced SEO | | fablau0 -
Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search
Hi! I have pages within my forum where visitors can upload photos. When they upload photos they provide a simple statement about the photo but no real information about the image,definitely not enough for the page to be deemed worthy of being indexed. The industry however is one that really leans on images and having the images in Google Image search is important to us. The url structure is like such: domain.com/community/photos/~username~/picture111111.aspx I wish to block the whole folder from Googlebot to prevent these low quality pages from being added to Google's main SERP results. This would be something like this: User-agent: googlebot Disallow: /community/photos/ Can I disallow Googlebot specifically rather than just using User-agent: * which would then allow googlebot-image to pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to assist in visibility, but the actual act of blocking the pages and getting the images picked up... Is this possible? Thanks! Leona
Intermediate & Advanced SEO | | HD_Leona0 -
Meeting Google's needs 100% with dynamic pages
We have bought into a really powerful search, very exciting We can define really detailed product based 'landing pages' by creating a search that pulles on required attributeseghttp://www.OURDOMAIN.com//search/index.php?sortprice=asc&followSearch=9673&q=red+coats+short-length Pop that in a link Short Red Coats on a previous page and wonderful, that gives a page of short red coats in price ascending order, one happy consumer, straight to a page that meets their needs Question 1 however unhappy Google right? Question 2 can we meet Google's needs 100% with a redirect permanent in an .htaccess file E.G redirect permanent /short-red-coats/ http://www.OURDOMAIN.com//search/index.php?sortprice=asc&followSearch=9673&q=red+coats+short-length
Intermediate & Advanced SEO | | GeezerG
Many thanks
CB0 -
What causes internal pages to have a page rank of 0 if the home page is PR 5?
The home page PageRank is 5 but every single internal page is PR 0. Things I know I need to address each page has 300 links (Menu problem). Each article has 2-3 duplicates caused from the CMS working on this now. Has anyone else had this problem before? What things should I look out for to fix this issue. All internal linking is follow there is no page rank sculpting happening on the pages.
Intermediate & Advanced SEO | | SEOBrent0